Residential exposure to radon and DNA methylation across the lifecourse: an exploratory study in the ALSPAC birth cohort [version 1; peer review: 2 approved, 1 not approved]

Background: Radon (and its decay products) is a known human carcinogen and the leading cause of lung cancer in never-smokers and the second in ever-smokers. The carcinogenic mechanism from radiation is a combination of genetic and epigenetic processes, but compared to the genetic mechanisms, epigenetic processes remain understudied in humans. This study aimed to explore associations between residential radon exposure and DNA methylation in the general population. Methods: Potential residential radon exposure for 75-metre area buffers was linked to genome-wide DNA methylation measured in peripheral blood from children and mothers of the Accessible Resource for Integrated Epigenomic Studies subsample of the ALSPAC birth cohort. Associations with DNA methylation were tested at over 450,000 CpG sites at ages 0, 7 and 17 years (children) and antenatally and during middle-age (mothers). Analyses were adjusted for potential residential and lifestyle confounding factors and were determined for participants with complete data (n = 786-980). Results: Average potential exposure to radon was associated in an exposure-dependent manner with methylation at cg25422346 in mothers during pregnancy, with no associations at middle age. For children, radon potential exposure was associated in an exposure-dependent manner with methylation of cg16451995 at birth, cg01864468 at age 7, and cg04912984, cg16105117, cg23988964, cg04945076, cg08601898, cg16260355 and cg26056703 in adolescence. Conclusions: Residential radon exposure was associated with DNA


Introduction
Radon is a noble gas with no stable isotopes. Radon-222 (half-life (t1/2) 3.82 days) and radon-220 (t1/2 55.2 seconds) are found in the environment as components of the radioactive decay chains of the naturally occurring, long-lived radionuclides uranium-238 and thorium-232, respectively, which are found to varying extents in all rocks and soil. Radon-222 is the product of the decay of long-lived radium-226 (t1/2 1600 years), and in most parts of the world, following their accumulation in enclosed spaces, inhalation of ²²²Rn and its short-lived radioactive decay products is the largest contribution of human exposure to ionising radiation. Globally, inhalation of ²²²Rn and its progeny is estimated to provide nearly half of the average annual effective dose (the radiation- and tissue-weighted whole-body absorbed dose) of 2.4 mSv from natural sources of ionising radiation 1. However, the geographical variation of the effective dose from ²²²Rn and its progeny is considerable, with a typical range of 0.2-10 mSv per annum. The contribution to the global average annual effective dose from the inhalation of ²²⁰Rn and its progeny is much less at 0.1 mSv.
Radon-222 and some of its short-lived progeny deliver most of their radiation dose through short-range alpha-particle emission and, following inhalation, the radiation dose is received primarily by the bronchial epithelium. There is compelling epidemiological and experimental evidence that ²²²Rn and its decay products (hereafter, "radon") cause lung cancer, with exposure-response associations approximately linear with no evidence of a threshold 2,3, and radon has been classified as a Group 1 carcinogen ("carcinogenic to humans") by the International Agency for Research on Cancer (IARC) 4.
Exposure to radon is considered the second leading cause of lung cancer after tobacco smoking, and the principal cause in never-smokers 5,6. The fraction of the lung cancer burden attributable to indoor exposure to radon ranges from 3% to 14% across the world, and is estimated at 3.3% 7, or 1,100 lung cancer deaths annually, in the UK specifically 3. Once inhaled, radon gas itself is mostly exhaled again, but a large proportion of the inhaled short-lived radon progeny deposits in the airways of the lungs, with the alpha-particles emitted by ²¹⁸Po and ²¹⁴Po dominating the dose to the lung. In contrast, radon gas transported from the lung makes a larger contribution than its decay products to doses to organs/tissues other than the lung, particularly those with a comparatively high fat content (including the red bone marrow (RBM)). However, the evidence for radon causing cancers other than lung cancer is limited, which relates to the fact that doses to other tissues from radon are relatively small. For example, the UK average annual equivalent dose (the radiation-weighted absorbed dose) to the RBM from radon is 80 µSv (children and adults) as compared to the RBM dose of 1430 µSv (5-year old) and 1070 µSv (adult) from all natural sources 8; this RBM dose from radon compares with that to the lung of 10,000 µSv.
Worldwide, the population-weighted geometric mean indoor level of radon is estimated to be 30 Bq m⁻³ 9, with a large geographical variation 3. In England, the concentration in homes is about 20 Bq m⁻³ on average, but it ranges from 5 to 10,000 Bq m⁻³ and more in some radon-prone areas; for comparison, the average outdoor concentration is 4 Bq m⁻³ 2. Variation between and within small geographical areas, as well as over time, can be the result of many factors including the abundance of ²²⁶Ra in the ground, fissuring of rocks, permeability of the soil, openings in the foundations of buildings through which radon can enter, and the extent to which a particular structure retains radon, including ventilation 3,10. In Great Britain, a strong correlation between domestic radon levels and socio-economic status (SES) has been observed, where lower SES residences have, on average, only two-thirds of the radon levels of those of the more affluent, which may be related to greater underpressure in warmer and better-sealed houses 11. Because people spend a significant portion of their time indoors, homes are typically the primary source of indoor radon exposure 3, and within houses concentrations can also vary widely, with (in the USA) concentrations typically 50% higher in basements compared to the ground floor 12.
The World Health Organisation (WHO) and International Commission on Radiological Protection (ICRP) recommend radon reference levels for homes in the range of 100-300 Bq m⁻³ 13, with the ICRP reference level of 300 Bq m⁻³ having been incorporated as the upper limit for the reference level by the European Union 14. The annual effective dose for a dwelling at 300 Bq m⁻³, and given several assumptions, is estimated at about 14 millisievert (mSv) 15. In the UK, Public Health England recommends that indoor radon levels should be below 200 Bq m⁻³ (averaged over the home; the Action Level), which corresponds to about 12 mSv annual effective dose 2, with 100 Bq m⁻³ being considered the Target Level for remediation work and for new buildings 2.
The multistage carcinogenic process is in all probability a mixture of genetic and epigenetic processes. Ionizing radiation, in addition to producing mutations mainly by gene deletion and gross chromosomal damage, can also induce epigenetic effects 4. Residential radon exposure has been associated with DNA-repair gene polymorphisms in adults 16 and children 17, with the latter study also reporting double-strand break repair gene polymorphisms. Epigenetics describes heritable chemical modifications of DNA and chromatin affecting gene expression, and includes DNA methylation, histone modifications and micro-RNAs, which can act in concert to regulate gene expression 18. In addition, the 'bystander effect', in which cells that are not directly irradiated but are in the neighbourhood of cells that have been also exhibit phenotypic features of genomic instability, is considered to be epigenetic in nature 4. DNA methylation is the most stable and most readily quantifiable epigenetic marker and is sensitive to pre- and postnatal exogenous influences 19. However, there are only limited data on effects of radon exposure on DNA methylation in humans, with some evidence from highly exposed uranium miners in China 20.
This study aims to explore whether residential radon exposure is associated with differential DNA methylation in the general population, and assesses whether any such associations vary across the lifecourse.

Data
This study used data from the Avon Longitudinal Study of Parents and Children (ALSPAC) 21,22. ALSPAC recruited 14,541 pregnant women with expected delivery dates between April 1991 and December 1992, which resulted in 14,062 live births of which 13,988 children were alive at 1 year of age. Details of all available data are provided through the fully searchable ALSPAC data dictionary.
A sub-sample of 1,018 ALSPAC mother-child pairs had DNA methylation measured using the Infinium HumanMethylation450 BeadChip (Illumina, Inc.) 23 as part of the Accessible Resource for Integrated Epigenomic Studies (ARIES) project 24. For this study, DNA methylation data generated from cord blood, from venous blood samples at age 7 years and again at age 15 or 17, and additionally from the mothers during pregnancy and at middle age were used. All DNA methylation analyses were performed at the University of Bristol as part of the ARIES project and have been described in detail previously 24.
Ethical approval for this study was obtained from the ALSPAC Ethics and Law Committee and the Local Research Ethics Committees (Reference B2805).

DNA methylation
DNA methylation profiles for ALSPAC children were generated at birth from cord blood and in childhood from peripheral blood at ages 7 and 15-17 years using the Illumina Infinium HumanMethylation450 BeadChip as part of the Accessible Resource for Integrated Epigenomic Studies (ARIES) 25. DNA was bisulphite-converted using the Zymo EZ DNA Methylation™ kit (Zymo, Irvine, CA). Infinium HumanMethylation450 BeadChips (Illumina, Inc.) were used to measure genome-wide DNA methylation levels at over 485,000 CpG sites. The arrays were scanned using an Illumina iScan, with initial quality review using GenomeStudio (version 2011.1). This assay detects methylation of cytosine at CpG islands using one probe to detect the methylated and one to detect the unmethylated loci. Single-base extension of the probes incorporated a labelled chain-terminating ddNTP, which was then stained with a fluorescence reagent. The ratio of fluorescent signals from the methylated site versus the unmethylated site determines the level of methylation at the locus.
Quality control and normalization of the profiles were performed using the meffil R package (version 1.1.0) as previously described 26. The level of methylation is expressed as a proportion (β-value) ranging from 0 (no cytosine methylation) to 1 (complete cytosine methylation). Finally, to reduce the influence of outliers in statistical models, normalized β-values were 90%-Winsorized.
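As a minimal sketch of the two quantities described above, the Python below computes a β-value as the methylated share of the total signal and applies 90% Winsorization. It assumes that "90%-Winsorized" means clamping values at the 5th and 95th percentiles and uses a simple interpolated percentile; the function names are illustrative, and this is not the meffil implementation (array software often also adds a small offset to the denominator, which is omitted here).

```python
def beta_value(methylated, unmethylated):
    """Share of total signal coming from the methylated probe (0 to 1)."""
    total = methylated + unmethylated
    if total <= 0:
        raise ValueError("total signal must be positive")
    return methylated / total


def winsorize_90(values):
    """Clamp values to the [5th, 95th] percentile range (assumed meaning of 90%)."""
    ordered = sorted(values)
    n = len(ordered)

    def percentile(p):
        # Linear interpolation between the two closest ranks
        pos = p * (n - 1)
        lower = int(pos)
        upper = min(lower + 1, n - 1)
        frac = pos - lower
        return ordered[lower] + frac * (ordered[upper] - ordered[lower])

    low, high = percentile(0.05), percentile(0.95)
    # Values inside the range are unchanged; extremes are pulled to the limits
    return [min(max(v, low), high) for v in values]
```

Winsorization keeps every observation in the model while limiting the leverage of extreme β-values, which is why it is preferred here over dropping outliers.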

Radon exposure
Potential residential radon exposure is available from the Health Protection Agency (HPA; now Public Health England) and British Geological Survey (BGS) 'radon potential dataset for Great Britain', and was obtained for the Avon area (which includes the original ALSPAC catchment area) from BGS after a data sharing agreement was agreed by BGS and the PI's Institute. Estimates of potential radon exposure were based on long-term radon measurements from 479,000 homes across Great Britain and were provided at a spatial resolution of 75-metre buffers, as the percentage of dwellings exceeding the 200 Bq m⁻³ Radon Action Level, in 6 classes: 1 (0-1%), 2 (>1-3%), 3 (>3-5%), 4 (>5-10%), 5 (>10-30%) and 6 (>30-100%). More information is available at: http://www.bgs.ac.uk/radon/hpa-bgs.html. To assess measurement error, we also linked the ARIES dataset to estimates from the freely available radon 'indicative atlas' 27, which is based on the 'potential dataset', but provides the estimates in 1-km-side squares.
Residential histories of mothers and children were geocoded to postcode centroid level, and were linked to average potential radon exposure using ArcGIS software (version 10.6) 28 within the ALSPAC Data Safe Haven. This resulted in at least one address match for 986 mothers and 1001 young people (including two sets of twins). Once each residential address had a radon potential exposure class assigned, time spent at each address was calculated. This was merged with ARIES sample provision dates, allowing time-weighted average potential radon exposures to be calculated up to the 'mothers at middle age', 'children at 7' and 'children at 15/17' sample extraction time points. For the cord and antenatal sample extractions, the radon exposure potential of the address at date of birth, or of the address closest in time to the sample time point, was assigned, respectively.
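The time-weighting step can be sketched as follows. This is an illustration under stated assumptions rather than the code used in the study: the tuple layout, the function name, and the use of days as the weighting unit are all hypothetical.

```python
from datetime import date


def time_weighted_class(addresses, sample_date):
    """Average radon potential class, weighted by days lived at each address.

    addresses: list of (start_date, end_date, radon_class); end_date may be
    None for the current address. Only time before sample_date is counted.
    """
    weighted_sum = 0.0
    total_days = 0
    for start, end, radon_class in addresses:
        # Truncate each residence period at the sample date
        stop = sample_date if end is None else min(end, sample_date)
        days = (stop - start).days
        if days <= 0:
            continue  # address began on or after the sample date
        weighted_sum += radon_class * days
        total_days += days
    if total_days == 0:
        raise ValueError("no residence time before the sample date")
    return weighted_sum / total_days


# Hypothetical history: one year in class 2, then one year in class 4
history = [
    (date(2001, 1, 1), date(2002, 1, 1), 2),
    (date(2002, 1, 1), None, 4),
]
average = time_weighted_class(history, date(2003, 1, 1))
```

Because the result is an average of ordinal classes, it takes non-integer values between 1 and 6, which is consistent with its treatment as a continuous variable in the primary analyses.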
These data were then linked to ALSPAC self-reported data selected to test for potential confounding (described below). After the linked exposure data were processed to minimise the risk of participant disclosure, the linked methylation-radon data were used for statistical analyses.

Statistical methods
For these analyses we only use participants with complete data. In the primary analyses average potential radon exposure was analysed as a continuous variable (range 1-6) to assess linear exposure-response associations. In addition, we also analysed associations based on binary exposure classifications (≤5% vs >5%).
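A small sketch of how the six radon potential classes relate to the binary classification is given below. The class boundaries are those listed for the radon potential dataset; the function and variable names are illustrative. How the dichotomy was applied to time-weighted averages is not stated in the text, so the sketch handles single classes only.

```python
# Upper bound (% of dwellings above the 200 Bq m^-3 Action Level) of each class
CLASS_UPPER_BOUND = {1: 1, 2: 3, 3: 5, 4: 10, 5: 30, 6: 100}


def radon_class(percent_above_action_level):
    """Map a percentage of dwellings above the Action Level to class 1-6."""
    if not 0 <= percent_above_action_level <= 100:
        raise ValueError("percentage must be between 0 and 100")
    for cls in sorted(CLASS_UPPER_BOUND):
        if percent_above_action_level <= CLASS_UPPER_BOUND[cls]:
            return cls


def is_high_exposure(cls, cutoff_percent=5):
    """True if the whole class lies above the cut-off (classes 4-6 for 5%)."""
    return CLASS_UPPER_BOUND[cls] > cutoff_percent
```

With the default 5% cut-off, classes 1 to 3 are "low" and classes 4 to 6 are "high"; passing `cutoff_percent=3` moves class 3 into the "high" group.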
Associations were tested using linear models using the limma R package (version 3.32.10) 29 . Associations were tested in (1) univariate analyses but with adjustment for the surrogate variables 30 to handle batch effects, sex differences, cell count heterogeneity and possible unknown confounders 31 , and (2) additionally with adjustment for potential confounding factors maternal age at birth, maternal BMI, smoking during pregnancy, partner smoking during pregnancy, AHRR CpG site that detects own smoking nearly as accurately as self-report 32 , mother alcohol intake in early pregnancy, equivalized income, parental occupation, and parental education, and (3) all factors of models 1 and 2 and additionally, for damp problems, central heating, boiler location, gas cooking, time windows open in the summer/winter day/night, and heavy traffic.
Associations at a false discovery rate (FDR) of less than 20%, calculated using the q method 33, are reported. Where associations between potential radon exposure and methylation were positive, these sites were defined as "hypermethylated" and, conversely, for inverse associations, these were defined as "hypomethylated".
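To illustrate the reporting rule, the sketch below adjusts a list of p-values for multiple testing and labels the direction of an association. Note the substitution: it uses the Benjamini-Hochberg step-up adjustment as a simpler stand-in for the q method cited above, so the values it produces are not the study's q-values. The function names are hypothetical.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def direction(coefficient):
    """Label a site by the sign of its exposure coefficient."""
    if coefficient > 0:
        return "hypermethylated"
    if coefficient < 0:
        return "hypomethylated"
    return "no change"


def reported_sites(site_ids, p_values, coefficients, fdr_threshold=0.20):
    """Sites passing the FDR threshold, with their direction of association."""
    adjusted = bh_adjust(p_values)
    return [
        (site, direction(coef))
        for site, adj, coef in zip(site_ids, adjusted, coefficients)
        if adj < fdr_threshold
    ]
```

At a 20% threshold, roughly one in five reported sites is expected to be a false discovery, which is why the choice of threshold matters for interpretation in an exploratory analysis.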

Participants and location
The results were based on 786 to 980 participants with complete information, depending on the time point analysed. An overview of the geographical study area and the distribution of potential radon exposure classes, as well as the distribution of addresses in each class, is shown in Figure 1, and indicates that 79% of addresses are in areas with low probability (class 1 and 2) of exposure >200 Bq m⁻³.

CpG sites
Results for CpG sites with FDR <0.20 are shown in Table 1. In mothers, average potential exposure to radon was associated in an exposure-dependent manner only with hypomethylation of cg25422346 during pregnancy (p = 1.1 × 10⁻⁸, FDR = 0.005), with no associations observed at middle age. For the children, radon potential exposure was associated in an exposure-dependent manner with hypomethylation at cg16451995 at birth (p = 3.2 × 10⁻⁷, FDR = 0.16) and with hypermethylation of cg01864468 at age 7 (p = 1.1 × 10⁻⁸, FDR = 0.005). In adolescence (age 15-17) there was evidence of exposure-dependent methylation at several CpG sites: cg04912984, cg16105117 and cg23988964 were hypermethylated with increased potential exposure, while cg04945076, cg08601898, cg16260355 and cg26056703 were hypomethylated in proportion to average potential radon exposure. The same CpG sites at the same timepoints were identified when average potential exposure to >200 Bq m⁻³ was dichotomized into low (≤5%) and high (>5%) probability (Table 2), and similarly when using another cut-off (≤3% vs >3%); data not shown.
Regardless of exposure metric, there is little evidence of significant confounding with directions and sizes of associations similar for univariable and both multivariable models.
To assess the impact of measurement error, the same analyses were repeated but with exposure based on the 'indicative radon atlas' using 1-km² spatial resolution (Table 3). Results were comparable to those based on the 75-m buffers.

Discussion and conclusions
In this exploratory study we aimed to investigate associations between residential exposure to radon in the general population and DNA methylation. Associations were observed with increasing probability of average potential exposure of the residence over 200 Bq m⁻³ in children at birth, age 7 and during adolescence, with single CpG sites affected at birth (cg16451995) and age 7 (cg01864468) and seven sites affected at age 15-17 (cg04912984, cg04945076, cg08601898, cg16105117, cg16260355, cg23988964, cg26056703) after adjustment for important confounding factors. These associations did not depend on the choice of cut-off used. For mothers an association with hypomethylation of cg25422346 was observed during pregnancy, but not at a later time point during middle age. To our knowledge, this is the first study identifying associations between radon exposure and methylation patterns in a general population.
Locations of the affected CpG sites of the children were on the PMM2 gene (cord blood), associated with abnormalities in amniotic fluid and congenital disorders, upstream of HCG14 (age 7), involved in the development of lung carcinoma, and NDRG2 and SGPL1, associated with glioblastoma development and Alzheimer's disease and nephrotic syndrome, respectively, LINC01197, as well as upstream of VANGL1, associated with congenital disorders, and FAM71A genes (age 15-17), while for mothers cg25422346 is located upstream of SMIM31. These methylation patterns, describing both hyper- and hypomethylation associated with potential residential radon exposure, have not been reported elsewhere. In a candidate gene study of Chinese uranium miners, the authors reported increased methylation of promoter regions of p16INK4a and O6-MGMT genes, as well as increased total methylation rate, depending on cumulative radon doses 20, and a study using BEAS-2B human lung cells exposed to 20,000 Bq m⁻³ radon for 30 minutes showed global hypomethylation and hypermethylation of candidate CpG sites at PTPRM and EDA2R genes 34. Similarly, these genes have also not come up in candidate gene studies of exposure to radon, in which gene-environment interactions with p53 35, GSTM1 and GSTT1 36, hOGG1 and APE1 37, ADPRT 38, XPG, ADPRT and NBS1 16, LIG4 39, and NBS1 and ATM1 have been reported. Possible explanations for the different genes for which hyper- or hypomethylation was associated with potential radon exposure in this study compared to other studies may be that the CpG sites identified in this study are involved in the 'bystander effect' rather than being the result of direct irradiation; that they may be a marker of earlier biological effects; or that methylation was measured in blood rather than in lung tissue. Residual confounding or chance findings also cannot be completely excluded.
This study has several limitations. Most importantly, the exposure metric used in this study is a relatively weak one. It is not generally possible to accurately predict indoor radon concentrations for specific buildings without individual measurements 3. Although people spend most of their time indoors at home, estimates are based on the modelled probability that a dwelling in the 75-m buffer that includes a person's home has a radon concentration exceeding 200 Bq m⁻³. Because of high spatial and temporal variability 40,41 this will inevitably have led to considerable misclassification. Assuming measurement error in this case is non-differential, generally resulting in bias to the null, it is interesting that exposure-response associations were still observed in this study with a relatively small sample size. Furthermore, the possibility of misclassification of radon exposure should affect all participants in a similar way and is unlikely to bias associations with DNA methylation.
Although there was little evidence of significant confounding in these analyses, residual confounding explaining these findings cannot be excluded. For example, rurality is a known confounding factor for studies on radon 41. However, within ALSPAC and certainly within Avon there are few true 'rural' residential areas as the area is quite heavily populated, so it is unlikely this will bias associations significantly. We also had no information on whether participants lived in houses or apartments (and in the latter case on which floor) 40 or whether houses had a basement 12, which will have added further measurement error.
There are known limitations in quality of the ALSPAC residential address history data in terms of missingness and gaps; although in this study the impact of this will be limited as the postnatal ARIES sample dates are linked to direct contact with participants where address details would have been validated. However, to enable assignment of potential radon exposure to individuals over periods of unknown residence, remediation was carried out by (a) setting the address start date to child date of birth where first address start date fell after child date of birth, which is a reasonable assumption because often the address start date represents a data capture date as opposed to an actual move date, and (b) by rectifying all other temporal gaps by calculating a mean radon potential exposure class based on the radon potential at the preceding and succeeding addresses.
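The two remediation rules, (a) and (b), can be sketched as below. This is an illustrative reading of the description, not the study's code: the tuple layout and function name are hypothetical, and the sketch assumes a chronologically sorted history in which every address has both a start and an end date.

```python
from datetime import date


def remediate_history(addresses, birth_date):
    """Apply rules (a) and (b) to a sorted list of (start, end, radon_class)."""
    if not addresses:
        return []
    fixed = [tuple(a) for a in addresses]

    # Rule (a): a first address starting after birth is backdated to birth
    first_start, first_end, first_class = fixed[0]
    if first_start > birth_date:
        fixed[0] = (birth_date, first_end, first_class)

    # Rule (b): fill each gap with the mean class of the neighbouring addresses
    result = [fixed[0]]
    for previous, following in zip(fixed, fixed[1:]):
        if following[0] > previous[1]:
            gap_class = (previous[2] + following[2]) / 2
            result.append((previous[1], following[0], gap_class))
        result.append(following)
    return result


# Hypothetical history with a late first start date and a one-year gap
history = [
    (date(1992, 3, 1), date(1995, 1, 1), 2),
    (date(1996, 1, 1), date(2000, 1, 1), 5),
]
filled = remediate_history(history, birth_date=date(1991, 6, 1))
```

The filled gap is assigned class 3.5 in this example, so imputed periods pull the time-weighted average towards the neighbouring addresses rather than towards any fixed default.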
The current analysis lacks directly measured blood-cell-type proportions, and we therefore accounted for cell count heterogeneity in the models using estimates obtained from surrogate variable analysis 30. This approach has been found to perform just as well or better 31 than the more commonly used method of Houseman et al. 42. In this case, it probably performs better in DNA methylation profiles generated from childhood peripheral blood because DNA methylation references are available only for adult blood 43 and cord blood. Levels of methylation vary between tissue types and may relate differently to traits and exposures, which may limit inferences from this study. In the current study we have methylation from blood samples, but it may have been beneficial had we been able to test associations in a more relevant tissue such as the lung.
Finally, this study had reduced statistical power due to the relatively limited sample size of the ARIES sub-sample, which was further diminished as a result of missing values. Alcohol consumption for adolescents could not be included as a potential confounding variable because this was only available for fewer than 200 adolescents. Because current approaches for multiple imputation are not feasible for genomic datasets including hundreds of thousands of measured variables (our study included variables corresponding to DNA methylation levels at over 480,000 CpG sites) we did not apply multiple imputation to increase sample size. In future, when feasible approaches have been developed, we plan to revisit these analyses.
The main strength of this study is the unique resource which allowed for the assessment of genome-wide methylation profiles at different time points linked to detailed phenotypic characterisation, which enabled assessment of the temporality of associations. In these analyses we used three cross-sectional models to compare methylation patterns at birth, age 7 and in adolescence, but with better characterization of the dynamic elements of the human methylome 44 , longitudinal analyses will help to better elucidate persistent and reversible effects of (environmental) exposures as well as critical periods of effect 45 . Information on epigenetic signals across the life-course and radon exposure are of interest because they have the potential to describe early biological effects, and the estimated induction (lag) period of lung cancer to radon exposure is between 5 and 25 years 1 .
In conclusion, this exploratory study is, to our knowledge, the first to examine associations between genome-wide DNA methylation and (potential) residential exposure to radon. Despite the relatively weak exposure metric, differential methylation associated with increased potential residential radon exposure was observed prenatally in mothers, for children at birth, age 7, and especially at age 15-17, but not for the mothers in middle age. Future work in a larger population, with replication in an independent sample, and using an improved radon exposure estimation methodology, most notably personal exposure measures, can further elucidate these associations.

Data availability
The potential residential radon exposure was provided by the British Geological Survey (BGS) under license for the current study (Licence number 2017/017RAD ED British Geological Survey © NERC. All rights reserved) and can be requested from BGS (http://www.bgs.ac.uk/radon/hpa-bgs.html). Details on who will be granted access to the data, and whether there will be a charge for data access can be found online.
ALSPAC data access is through a system of managed open access. Full details of all available data can be accessed through a fully searchable data dictionary provided on the ALSPAC study website (http://www.bris.ac.uk/alspac/researchers/data-access/datadictionary) and the steps below highlight how to apply for access to both the data included in this data note and all other ALSPAC data. The datasets presented in this data note are linked to ALSPAC project number B645; please quote this project number during your application. The ALSPAC variable codes highlighted in the dataset descriptions can be used to specify required variables.
1. Please read the ALSPAC access policy (PDF, 627kB) which describes the process of accessing the data and samples in detail and outlines the costs associated with doing so.
2. You may also find it useful to browse our fully searchable research proposals database, which lists all research projects that have been approved since April 2011.
3. Please submit your research proposal for consideration by the ALSPAC Executive Committee using the online process. You will receive a response within 10 working days to advise you whether your proposal has been approved.
If you have any questions about accessing data, please email alspac-data@bristol.ac.uk.
The ALSPAC data management plan describes in detail the policy regarding data sharing, which is through a system of managed open access.

General comments
This is a generally well written paper. However, the analysis is defective in a number of ways. In the first instance, arguably the wrong thing is being analysed. Probably cumulative radon exposure up to a given age is the variable more likely to be relevant than instantaneous radon exposure level; one would expect this to correlate with radon exposure rate fairly highly at young age, but much less well at older ages. This may have something to do with why the measures of the mothers in middle age found nothing. Analysis of radon using the 6-level probability-of-exposure-above-action-level metric throws away data and may reduce power. Analysis should use continuous radon concentration levels. Arguably the false discovery rate (FDR) threshold is set too high. Sensitivity analysis should be conducted using Bonferroni multiple-comparison correction of p-values rather than FDR.
There should be more discussion of the plausibility of these findings, in particular the lack of internal consistency, associated with the fact that different CpG sites are found at different ages, and the lack of external consistency, associated with the fact that none of the sites identified in this MS are the same as sites previously identified after much higher-level radon exposures.

Specific comments (page, column, line)
p.4 col. 2 l.7-10: It would be better to analyse radon using the mean radon concentration as a continuous variable, rather than as a categorical variable, recording percentages above the action level (200 Bq m⁻³), which throws away data and will also reduce power. There is also a degree of arbitrariness in how one chooses the 6 groups. I assume that radon exposure is measured instantaneously at various ages. As above, arguably cumulative lifetime radon exposure up to the age at blood measurement may be the more relevant exposure measure. This should be evaluated, if possible.
p.4 col. 2 l.11-14: It is not clear how this really helps assess measurement error. It should be regarded more as a type of sensitivity analysis.
p.4 col. 2 l.-19: Is the 6-level radon variable described previously on this page really being analysed as a continuous variable? If so, this is extraordinarily unwise, as it will introduce non-linearities into the model. The radon concentration should either (as above) be analysed as a continuous measure (my preference), or if you are going to define this 6-level variable it must be analysed as a factor variable.
p.4 col. 2 l.-18 --16: To further collapse the radon exposures into a <5% vs >5% variable is arguably even more a degradation of this data. I don't think this should be reported.
p.4 col. 2 l.-12 --5: It would be helpful for all these variables to indicate where this information came from. Presumably some (e.g., parental income, parental education, tobacco and alcohol consumption) are assessed via questionnaire, others (e.g., heaviness of traffic) possibly via linkages with other datasets. Also, the levels used to categorize these variables, where they are not simply linear variables in the model, need to be defined.
p.5 col. 1 l.16: An FDR of 0.20 is too high. A more conservative choice would be 0.05, or even 0.01, and at least the higher of these should be the default, with the 0.01 value (possibly also the 0.20 value) used for sensitivity analysis. Another sensitivity analysis that should be performed is to use Bonferroni multiple-comparison p-value adjustment, although this may be too conservative.
p.5 col. 1 l.-16: As above, I don't think this really assesses the impact of measurement error. It should be regarded more as a type of sensitivity analysis.
p.5 col. 1 l.-6 --3: The plausibility of there being a radon methylation effect at birth (implying transplacental transmission of radon) should be discussed. The fact that at different ages completely different CpG sites are affected also seems implausible, and again needs discussion.
p.5 col. 2 l.17-21: It should be pointed out that radon levels in this Chinese uranium miner study are likely to exceed those here by a substantial amount (by at least an order of magnitude, and probably more).
p.5 col. 2 l.-11 --5: It is not the case that "Assuming measurement error in this case is nondifferential, generally resulting in bias to the null … misclassification of radon exposure will affect all participants in a similar way and is unlikely to bias associations with DNA methylation". First, these two sentences seem to imply different things (of bias and its lack)! If a classical error model is assumed then the associations would be biased towards the null (see R Carroll et al Measurement error in non-linear models. A modern perspective 1 ), as implied by the first (but not the second) of these sentences. If on the other hand a Berkson error model is assumed then estimates will be approximately unbiased, although the variance will be underestimated (Zhang et al PLoS ONE 2017 2 ), as implied by the second (but not the first) sentence.
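The contrast between the two error models named in this comment can be illustrated with a small simulation. The sketch below is not taken from either cited reference; the sample size, unit variances, seed, and function names are arbitrary choices made for illustration. It fits an ordinary least-squares slope when the exposure is observed with classical error (observed = true + error) and when it is assigned with Berkson error (true = assigned + error).

```python
import random


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate(n=20000, true_slope=1.0, seed=1):
    """Return (slope under classical error, slope under Berkson error)."""
    rng = random.Random(seed)

    # Classical error: the measurement scatters around the true exposure
    true_c = [rng.gauss(0, 1) for _ in range(n)]
    observed = [t + rng.gauss(0, 1) for t in true_c]
    y_c = [true_slope * t + rng.gauss(0, 1) for t in true_c]

    # Berkson error: the true exposure scatters around the assigned value
    assigned = [rng.gauss(0, 1) for _ in range(n)]
    true_b = [a + rng.gauss(0, 1) for a in assigned]
    y_b = [true_slope * t + rng.gauss(0, 1) for t in true_b]

    return ols_slope(observed, y_c), ols_slope(assigned, y_b)


classical_slope, berkson_slope = simulate()
```

With equal variances for the true exposure and the error, the classical-error slope is expected to be attenuated to about half the true value, whereas the Berkson-error slope stays close to the true value. An area-level modelled exposure assigned to all dwellings in a buffer is closer in structure to the Berkson case.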

© 2019 Hung R.
This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Rayjean J. Hung
Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada

The authors investigated whether indoor radon exposure is associated with any DNA aberration based on maternal blood samples during pregnancy, cord blood, and offspring's peripheral blood samples at ages 7 and 15, within the ALSPAC birth cohort. They reported several different CpG sites to be associated with radon exposure at different time points, using an FDR of 20% as the threshold.
The main comments are summarized below:

Citing current literature:
The majority of the background section is about radon exposure. While it is necessary to cover the properties of radon exposure and its measurement, more specific context related to methylation and radon, or more generally to ionizing radiation, would be more informative for readers.

Potential trans-generational effect:
With the multigenerational data in ALSPAC, one of the most interesting possibilities is to investigate whether radon exposures confer any transgenerational effects on methylation. Two of the relevant citations based on animal models were provided above. It would be valuable to assess if the methylation aberration induced by radon exposures persists across generations.

Statistical Analysis:
○ An FDR threshold of 20% seems rather generous. Most epigenomic analyses apply an FDR of 10% or 5% as the threshold. When using 10% as the threshold, only cg25422346 from maternal blood during pregnancy remains significant.

○ The analysis is based on complete-case data. It would be helpful to know if the subset that was included in the analysis is representative of the whole population in terms of their main characteristics and radon exposures. Based on the tables, it seems that the main reduction of sample size occurs between Model 1 and Model 2. One could also consider conducting multiple imputation for the covariates used for adjustment in the model to minimize the loss of power.
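The sensitivity of the number of reported sites to the chosen FDR threshold can be sketched as follows. This is an illustration only: it uses the Benjamini-Hochberg step-up procedure as a stand-in, which is not the q method cited by the authors, and the p-values are hypothetical rather than taken from the study. The function names are likewise assumptions made for this example.

```python
def bonferroni_reject(pvalues, alpha):
    """Reject hypotheses whose p-value is at most alpha / number of tests."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg_reject(pvalues, q):
    """Benjamini-Hochberg step-up procedure at FDR level q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k such that p_(k) <= k * q / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * q / m:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            reject[idx] = True
    return reject


# Hypothetical p-values, for illustration only (not from the study).
pvals = [0.0001, 0.004, 0.019, 0.03, 0.07, 0.2, 0.5, 0.8]

# Number of discoveries at each FDR threshold discussed in the review.
counts = {q: sum(benjamini_hochberg_reject(pvals, q)) for q in (0.05, 0.10, 0.20)}
n_bonferroni = sum(bonferroni_reject(pvals, 0.05))

print(counts, n_bonferroni)
```

With these illustrative inputs, the number of discoveries grows as the threshold is relaxed from 5% to 20%, which is the pattern at issue here.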

Conclusions:
○ With the exception of cg25422346, the rest of the results have FDR larger than 10%. The authors did mention that this is an exploratory analysis. More caution on the relatively limited sample size can be emphasized.

○ While there are some interesting findings, none of the CpG sites overlap between any 2 time points, even when comparing the results of Age 7 and 15 of the offspring. One would expect that some methylation changes could persist over time. More discussions, either through the perspective of limited statistical power or biological interpretation, on the highly variable findings by time points would be warranted.

Potential trans-generational effect:
With the multigenerational data in ALSPAC, one of the most interesting possibilities is to investigate whether radon exposures confer any trans-generational effects on methylation. Two of the relevant citations based on animal models were provided above. It would be valuable to assess if the methylation aberration induced by radon exposures persists across generations.
RESPONSE: Given the limited sample size, we do not think it is possible to study this in detail using this sample, especially since these exploratory analyses have not highlighted methylation effects on the same CpG sites or genes for mothers and children (as was also the case across different timepoints in children). However, in response to comment 1 above, we have now added the possibility of transgenerational effects to the manuscript.

Statistical Analysis:
An FDR threshold of 20% seems rather generous. Most epigenomic analyses apply an FDR of 10% or 5% as the threshold. When using 10% as the threshold, only cg25422346 from maternal blood during pregnancy remains significant.
○ RESPONSE: We appreciate that an FDR of 20% is more generous than the traditionally used 10% (or 5%), but believe that, because of the exploratory nature of this study, it was more important to minimise false negative findings than false positive findings; the reported associations will need to be confirmed in future independent studies. We have added this to the Statistical methods section: "Because of the exploratory nature of this study, associations at false discovery rate (FDR) less than 20% calculated using the q method 33 are reported instead of a more traditional 10% threshold."

One could also consider conducting multiple imputations for the covariates used for adjustment in the model to minimize the loss of power.
○ RESPONSE: We addressed the issue of multiple imputation in the Discussion section: "…we did not apply multiple imputation to increase sample size. In future, when feasible approaches have been developed, we plan to revisit these analyses."

Conclusions:
With the exception of cg25422346, the rest of the results have FDR larger than 10%. The authors did mention that this is an exploratory analysis. More caution on the relatively limited sample size can be emphasized.

RESPONSE:
We have now specifically added this as a limitation to our study in response to an earlier comment.

While there are some interesting findings, none of the CpG sites overlap between any 2 time points, even when comparing the results of Age 7 and 15 of the offspring. One would expect that some methylation changes could persist over time. More discussions, either through the perspective of limited statistical power or biological interpretation, on the highly variable findings by time points would be warranted.

RESPONSE:
We have now specifically added this point to the first paragraph of the Discussion section: "However, none of these associations were observed at multiple time points." There is an array of possible explanations for this, including the ones put forward by the reviewer, and because this is the first study looking at these associations, it is not yet possible at this stage to favour one explanation over another. As such, we decided not to add a detailed discussion of this.

Overall, it is not immediately clear what biological insights are gained from these results. Some more in-depth discussion on the implications of these results would be helpful. For example, some regions were hypermethylated and some were hypomethylated. Whether these findings align with the biological functions of those genetic regions should be discussed. If there are no explicit mechanism related to

"Residential radon exposure has been associated with DNA repair gene polymorphisms in adults 16 and children 17 , with the latter study also reporting double-strand break repair gene polymorphisms." Please check in the cited articles what associations have been reported, and correct the statement accordingly.
RESPONSE: "Residential radon exposure has been associated with DNA-repair gene polymorphisms in adults 16 and children 17 , with the latter study also reporting double-strand break repair gene polymorphisms."
Has been changed to: "Residential radon exposure has been associated with DNA-repair gene polymorphisms in adults (XpG gene Asp1104His, ADPRT gene Val762Ala, and NBS1 gene Glu185Gln polymorphisms) 16, and partly replicated in children (XpD gene Lys751Gln, XpG gene Asp1104His and ADPRT gene Val762Ala polymorphisms) 17 , with the latter study also reporting double-strand break repair gene polymorphisms."

Methods: The 2nd paragraph reports at which ages blood was collected from study subjects (children and mothers). The information for children is unnecessarily repeated in the subsection describing "DNA methylation".
RESPONSE: Thank you very much for highlighting this; we have removed it from the "DNA methylation" section.