Assessment of dried blood spots for DNA methylation profiling

Background: DNA methylation reflects health-related environmental exposures and genetic risk, providing insights into aetiological mechanisms and potentially predicting disease onset, progression and treatment response. An increasingly recognised need for large-scale, longitudinally-profiled samples collected world-wide has made the development of efficient and straightforward sample collection and storage procedures a pressing issue. An alternative to the low-temperature storage of EDTA tubes of venous blood samples, which are frequently the source of the DNA used in such studies, is to collect and store at room temperature blood samples using purpose built filter paper, such as Whatman FTA® cards. Our goal was to determine whether DNA stored in this manner can be used to generate DNA methylation profiles comparable to those generated using blood samples frozen in EDTA tubes. Methods: DNA methylation profiles were obtained from matched EDTA tube and Whatman FTA® card whole-blood samples from 62 Generation Scotland: Scottish Family Health Study participants using the Infinium HumanMethylation450 BeadChip. Multiple quality control procedures were implemented, the relationship between the two sample types assessed, and epigenome-wide association studies (EWASs) performed for smoking status, age and the interaction between these variables and sample storage method. Results: Dried blood spot (DBS) DNA methylation profiles were of good quality and DNA methylation profiles from matched DBS and EDTA tube samples were highly correlated (mean r = 0.991) and could distinguish between participants. EWASs replicated established associations for smoking and age, with no evidence for moderation by storage method. Conclusions: Our results support the use of Whatman FTA® cards for collecting and storing blood samples for DNA methylation profiling. This approach is likely to be particularly beneficial for large-scale studies and those carried out in areas where freezer access is limited. Furthermore, our results will inform consideration of the use of newborn heel prick DBSs for research use.


Introduction
Recent technological advances have facilitated the highthroughput genome-wide measurement of DNA methylation, yielding many studies reporting links between methylation and several conditions, diseases and phenotypes. The majority of studies have profiled DNA methylation in peripheral blood due to ease of sampling, which also renders it a useful tissue for identifying biomarkers. Blood samples for DNA methylation profiling are typically frozen prior to DNA extraction, with some samples being stored for years before use. As the sample numbers used for studies of DNA methylation, particularly epigenome-wide association studies (EWASs), continue to grow and the value of repeat sampling over time and exposures is recognised, so the logistical and financial issues surrounding whole blood sample collection and storage escalate. Moreover, population differences in DNA methylation [1][2][3][4] highlight the need to profile DNA methylation around the world, including in low-and middle-income countries and rural settings, which might lack the required facilities for the low-temperature storage of EDTA blood collection tubes.
A potential alternative that would permit field collection and storage as well as postal collection is to collect finger or heel prick blood samples on filter paper 5 . As part of Generation Scotland: Scottish Family Health Study (GS:SFHS) 6,7 , Whatman FTA ® cards were used to spot 100μl of the peripheral (venous) blood samples obtained at the baseline research clinic appointment, which took place between 2006 and 2011. These cards contain chemicals that lyse cell membranes and denature proteins, thus protecting DNA from degradation. Samples collected on these cards can be stored at room temperature for several years 8 prior to DNA extraction, thus avoiding the financial and logistical issues associated with freezer storage. Whatman FTA ® cards have been validated for genetic studies 9-11 ; however, the effect of storing samples in this manner on the epigenome is less well-established. Previous studies provide support for the possibility of profiling DNA methylation in samples stored on Whatman FTA ® cards 8,12 ; however, these studies were limited by their focus on either a small number of individuals (n = 2) 12 or a single gene of interest 8 and a relatively short time interval between sample collection and DNA extraction (maximum of approximately two years). Moreover, to the best of our knowledge, no previous study has compared genome-wide DNA methylation profiles obtained from dried blood spots (DBSs) stored on Whatman FTA ® cards with those from DNA from matched whole blood samples frozen in EDTA tubes (henceforth referred to as "EDTA samples").
The matched DBSs and EDTA samples obtained from GS: SFHS 6,7 have permitted us to carry out this comparison using samples from 62 participants. We show that it is technically feasible to measure DNA methylation from DBSs stored on Whatman FTA ® cards, that the methylation profiles of DBS samples are highly correlated with those from EDTA samples from the same individuals, and that EWASs using DBS samples replicate well-established differentially methylated loci associated with smoking and age. These findings indicate that this storage method represents a viable alternative to freezing blood samples for DNA methylation studies.
In addition to supporting the use of DBSs as a solution to the challenges posed by the need to collect and store large numbers of blood samples for DNA methylation studies in a wide-range of sociodemographic settings, our results also provide encouragement for the ability to profile DNA methylation in DBSs routinely collected on filter paper (commonly referred to as Guthrie cards) from new-born babies. The Guthrie cards are used clinically to screen for a prescribed set of metabolic and genetic disorders. These tests do not exhaust the sample and the remaining material is typically stored, often indefinitely. The ability to profile DNA methylation from Guthrie cards would present a number of important opportunities 13,14 , including: addressing the issues of reverse causation and confounding, which plague EWASs; anchoring population studies of gene-environment interactions at the beginning of life; and permitting the development of predictors of future disease onset, thus facilitating early intervention.
In Denmark, research access to Guthrie cards is possible through the Danish Newborn Screening Biobank 15 and in 2017 it was estimated that almost 40% of the 2.1 million samples stored in this biobank had contributed to published research 16 . In Scotland, however, a moratorium on research access to Guthrie cards was imposed in 2009 and remains in place pending a public consultation 17 . An important step in determining whether medical researchers will be permitted access to this resource (over 2.5 million cards collected since 1965, including 9,788 GS:SFHS participants with genome-wide genotype data and extensive phenotype data) will be assessments of the feasibility of obtaining biologically meaningful and clinically valuable data from archived DBSs. Although the Whatman FTA ® card is not identical to the card types used in the NHS Scotland new-born screening programme, some of the issues addressed in the present study, such as low DNA yield, a long period of storage, and the necessity of using a sample comprising heterogeneous cell types, are relevant to the potential use of Guthrie cards for research.

Cohort description
Genome-wide DNA methylation was profiled in whole blood samples from 62 GS:SFHS participants, chosen to have matched DNA samples from whole blood stored in EDTA tubes, DBSs stored on Whatman FTA ® cards and complete phenotype data (on sex, age and smoking status). GS:SFHS has been described in detail elsewhere 6,7 ; further information and information on research access can be found here. Studies arising from GS:SFHS can be found here. Briefly, GS:SFHS comprises ~24,000 individuals who were aged 18 years or over at the time of recruitment. At the time of blood sample collection, participants were deeply phenotyped for a range of demographic, health, and social variables. A favourable ethical opinion for GS:SFHS was obtained from the NHS Tayside Committee on Medical Research Ethics, on behalf of the National Health Service (reference: 05/S1401/89). GS:SFHS has Research Tissue Bank Status (reference: 15/ES/0040).
Blood sample collection and DNA extraction Each participant provided a blood sample (9ml), which was collected in an EDTA tube for DNA extraction, when attending the baseline GS:SFHS clinical appointment. This sample was stored at -20°C for between 33 and 497 days. At this point, 100μl of the sample (approximately equivalent to two drops of blood) was blotted onto a Whatman FTA ® classic card (GE Healthcare; cat. no. WB120205) and DNA was extracted from the remaining sample using the Nucleon BACC3 Genomic DNA Extraction Kit (GE Healthcare; cat. no. RPN8512) 18 , following the manufacturer's instructions. The Whatman FTA ® cards were stored at room temperature for between 5 and 10 years until DNA extraction for this project. DNA was extracted from the entire blood spot using the QIAamp DNA Investigator Kit (Qiagen; cat. no. 56504), following the manufacturer's instructions. The concentration of the DNA samples was measured using a Qubit 2.0 fluorometer and the Qubit dsDNA HS assay (Thermo Fisher; cat. no. Q32854). Sample storage, DNA extraction and DNA quantification took place at the Edinburgh Clinical Research Facility at the University of Edinburgh.
Genome-wide methylation profiling Whole blood genomic DNA (EDTA samples: 500ng; DBS samples: 212-500ng) was treated with sodium bisulphite using the EZ-96 DNA Methylation Kit (Zymo Research; cat. no. D5004), following the manufacturer's instructions. DNA methylation was profiled using the Infinium HumanMethylation450 BeadChip (Illumina Inc.; cat. no. WG-314-1001), according to the manufacturer's protocol. Paired EDTA and DBS samples from a given participant were profiled on separate slides and matched for row position.
Raw intensity (.idat) files were read into R 19 v. 3.5.1 using minfi 20 v. 1.28.0 and quality control was carried out using shinyMethyl 21 v. 1.18.0 and functions within wateRmelon 22 v. 1.26.0. ShinyMethyl was used to create a "QC plot", which plots the log median intensity of the unmethylated signal against the log median intensity of the methylated signal for each array, and density plots of the beta-values separated by probe type (type 1 or 2) and colour channel for the type 1 probes. These plots were assessed for outlier samples by visual inspection. The performance of each of the 14 types of control probe on the array was assessed by inspecting plots created by shinyMethyl showing the raw signal intensities for each array. Outliers were assessed by visual inspection. ShinyMethyl's sex prediction plot was used to assess whether a participant's predicted sex based on the difference between the median copy number intensity for the Y chromosome and the median copy number intensity for the X chromosome matched their self-reported sex. Next, the pfilter function in wateRmelon was used to exclude poor-performing samples and probes. Samples with ≥ 1% sites with a detection p-value of > 0.05 were excluded. Probes were excluded from the dataset if: (i) ≥5% samples had a beadcount of < 3; or (ii) ≥ 0.5% samples had a detection p-value of > 0.05. Probes that had been predicted to cross-hybridise were removed using the rmSNPandCH function in the R package DMRcate 23 v. 1.18.0, which removes cross-reactive probes identified by Chen et al. 24 . Probes on the X and Y chromosomes and those that target loci affected by a SNP at the CpG site or at the site of single base extension (in the case of Type 1 probes) were removed prior to carrying out EWASs.
The data from all samples were normalised together using the dasen method in wateRmelon. Dasen adjusts the background difference between Type I and Type II assays (by adding the offset between Type I and II probe intensities to Type I intensities) and then performs between-array quantile normalisation for the methylated and unmethylated signal intensities separately (Type I and Type II assays normalised separately).

Assessment of the relationship between detection p-value
and DBS sample characteristics A paired t-test was performed using R 19 v. 3.5.1 to determine whether there was a difference in the total number of sites per sample with a poor detection p-value (detection p > 0.05) between the DBS and EDTA samples, with p ≤ 0.05 deemed to be significant. The relationships between the total number of sites for each sample with a detection p-value greater than 0.05 and (i) sample storage time and (ii) quantity of DNA hybridised to the array were assessed by linear modelling, with p ≤ 0.05 deemed to be significant.

Estimation of whole blood cellular composition
Estimated cell counts for B-lymphocytes, granulocytes, monocytes, natural killer cells, CD4+ T-lymphocytes and CD8+ T-lymphocytes were obtained using the estimateCellCounts function in minfi. This function implements Jaffe and Irizarry's 25 modification of Houseman's 26 algorithm. Paired t-tests were performed to assess whether blood storage method (EDTA tube vs. Whatman FTA ® card) affected cellular composition. Correction for multiple testing was implemented using the Benjamini-Hochberg false discovery rate (FDR), with q-values of ≤ 0.05 deemed to be significant.
Assessment of the correlation between methylation profiles obtained from EDTA tube and DBS paired samples and unsupervised hierarchical clustering The pair-wise correlation between all samples was assessed by calculating Pearson correlation coefficients. Unsupervised hierarchical clustering plots were generated using the R package lumi 28 v. 2.34.0 using the same set of variable probes as used for the correlation analysis.
Identification of differentially methylated positions associated with smoking or age and the interaction with storage method EWASs were carried out (i) to identify loci where DNA methylation is associated with smoking or age in the EDTA or DBS samples and (ii) to establish whether these associations were moderated by storage method. EWASs for smoking and age were implemented in the R package limma 29 v. 3.38.3 by fitting linear models in which DNA methylation (beta-value) was the outcome variable and age or smoking (a categorical variable with the following levels: current smoker, former smoker who gave up under 12 months ago, former smoker who gave up 12 months or longer ago, or never smoked) was the predictor of interest. When smoking status was the predictor of interest, the following covariates were included in the model: age, sex, slide ID, row number and estimated proportions of B-lymphocytes, granulocytes, natural killer cells, CD4+ T-lymphocytes and CD8+ T-lymphocytes. The same covariates were included for the EWAS of age with the addition of "pack years", a measure that indicates an individual's lifetime exposure to tobacco. Pack years were calculated by multiplying the years an individual had smoked for by the maximum number of packs of cigarettes they ever smoked per day (a pack = 20 cigarettes). A conversion was used for cigars (a cigar = four cigarettes) and rolling tobacco (a 25g pack = 50 cigarettes). EWASs to assess the interaction between smoking status or age and storage method (EDTA tube or Whatman FTA ® card) were carried out using linear mixed effects models implemented in the R package nlme 30 v. 3.1-137. The fixed effects were the same as for the main effects EWASs, with the addition of a variable representing storage method, and each participant was modelled with a random intercept. The threshold for significance was defined as p ≤ 3.6 × 10 -8 , as recommended by Saffari et al. (2018) 31 .

Extraction of DNA from DBSs stored on Whatman FTA ® cards
The concentration of the DNA extracted from the DBSs ranged from 3.8 ng/μl to 145.8 ng/μl, with a mean yield of 18.0 ng/μl, in a total volume of 50 μl. The recommended minimum input for the Infinium HumanMethylation450 BeadChip, 500 ng DNA, was available for 43 of the 62 samples, with all but two samples having at least 250 ng DNA as input. The lowest array input was 212 ng DNA.
Quality control of DNA methylation arrays Initial assessment of array performance using the R package shinyMethyl indicated that all arrays had performed well. Neither the QC plot ( Figure 1) nor the density plots of M-values and beta-values ( Figure 2) produced using shinyMethyl indicated any obvious outliers. Both plot types did, however, indicate a slight shift between the EDTA and DBS samples. From the QC plot it can be seen that there was a tendency for the DBS samples to have slightly lower methylated signal intensity (y axis) and a higher unmethylated signal intensity (x axis). From the density plots, a slight shift in distributions, which was most evident at the extremes of the beta-value range (i.e. 0 and 1), could be seen for type 1 probes in both the green ( Figure 2A) and red ( Figure 2B) colour channels and type 2 probes ( Figure 2C). This shift was most obvious for type 2 probes at the methylated end of the beta-value range, such that fewer sites in DBS samples showed the highest levels of DNA methylation. These differences in overall signal intensity were not, however, reflected by differences in the performance of the control probes. Plots of the output of these probes supported the conclusion that all arrays had performed well, with no obvious outliers (Figure 3). Importantly, the bisulphite conversion control probes did not indicate a difference in bisulphite conversion efficiency between EDTA and DBS samples and the non-polymorphic control probes, which measure overall assay performance, did not appear to differ according to sample storage method.
In keeping with these observations, no samples were excluded based on the proportion of poorly detected sites. There was, however, a significant difference between the DBS and EDTA blood samples in the number of sites per sample with a poor detection p-value (p = 0.0177), such that the DBS samples had fewer sites per sample with a poor detection p-value than the EDTA samples (mean difference = -63.7 sites/sample). For the DBS samples, the total number of sites per sample with a poor detection p-value was not related to storage time (p = 0.498) or the amount of DNA hybridised to the array (p = 0.220).
In total, 292 sites were excluded as their beadcount was lower than three in at least five percent of samples and 2,280 sites were excluded due to having at least one percent of samples with a detection p-value greater than 0.05. Following the exclusion of probes that have been predicted to cross-hybridise, 452,278 probes were retained in the dataset, with this number being reduced to 436,309 after excluding probes on the X and Y chromosomes and those affected by a SNP at the CpG site or the site of single base extension (for type 1 probes).

Participant and sample characteristics
The mean age of the participants was 31.8 years (range = 18.1-42.8 years; standard deviation = 7.2 years) and 62.9% of the  Assessment of the effect of storage method on whole blood cellular composition Sample storage method (EDTA vs. DBS) was not observed to be associated with the proportion of any of the six estimated blood cell types (min. q = 0.103; Table 1).
Assessment of the comparability of DNA methylation profiles obtained from DBS and EDTA tube-stored whole blood samples Using a subset of probes defined as showing variation in our sample (n = 190,044), Pearson correlation coefficients were calculated between all pairs of samples. For all participants, the highest correlation coefficient was observed between the two blood sample types (EDTA and DBS) from that participant (mean r = 0.991, range = 0.976 -0.995). For comparison, the pairwise correlation coefficients between pairs of samples from different participants (across both storage methods) had a mean of r = 0.953 and ranged from 0.904 -0.976. Consistent with these observations, unsupervised hierarchical clustering performed using the same variable set of probes indicated a high level of similarity between the DBS and EDTA samples from the same participant, with methylation differences being sufficient to distinguish between participants (Figure 4).
Assessment of the relationship between DNA methylation and Whatman FTA ® card storage time An EWAS was performed to establish whether the length of time that a sample is stored on a Whatman FTA ® card is associated with altered DNA methylation. This did not reveal any significant differences (min. p = 1.19 × 10 -5 ; summary statistics are available as Underlying data 32 ) Assessment of the effect of sample storage method on the relationship between DNA methylation and age and smoking Age and smoking have been robustly associated with altered DNA methylation by several previous studies. EWASs were  Table 1. Assessment of the relationship between blood cell type proportions and sample storage method. Six blood cell types were estimated using the estimateCellCounts function in the R package minfi and associations with method of blood sample storage (EDTA tube vs. Whatman FTA card) were assessed by paired t-tests. A Benjamini-Hochberg false discovery rate (FDR) adjustment (q-value) was applied to account for the number of tests carried out and statistical significance was defined as q ≤ 0.05.  carried out to (i) identify age-and smoking-associated differentially methylated positions (DMPs) in the EDTA and DBS samples when considered separately and (ii) assess whether sample storage method exerted an effect on these associations. EWASs for age and smoking status identified significant DMPs in both the EDTA and the DBS samples (Table 2-Table 5; complete summary statistics are available as Underlying data 32 ), while no significant associations were identified between DNA methylation and the interaction between storage method and age or smoking (min p = 5.57 × 10 -6 ; summary statistics are available as Underlying data 32 ).

Discussion
In agreement with previous studies, we have demonstrated that it is possible to obtain high quality DNA methylation profiles from archived DBSs stored on Whatman FTA ® cards and, for the first time, shown that these profiles are highly comparable to those obtained from matched whole blood DNA obtained from samples stored in EDTA tubes. Our findings support the use of Whatman FTA ® cards for the routine collection of blood samples for studies of DNA methylation. There are potentially substantial benefits to this approach, particularly for large-scale studies and those carried out in rural or remote settings and/or in low-and middle-income countries. The blood collection by finger prick protocol is comparable to that used for blood glucose monitoring in diabetes and could be self-administered by study participants.
DNA methylation profiles obtained from the DBSs appeared to be of good quality: on almost all comparison metrics, the performance of the DBS samples could not be distinguished from the performance of the EDTA tube samples. There was a tendency for DBS samples to have higher methylated signal intensity and lower unmethylated signal intensity than the EDTA samples, but this did not result in a difference in the performance of the control probes, or in the number of sites with a poor detection p-value. Encouragingly, the overall rate of site detection was slightly better in the DBS samples than the EDTA samples. These observations support and extend those of previous studies in which DBS DNA (both from Whatman FTA ® cards and other card types) has been profiled successfully using Infinium BeadArrays 12,33-37 .
Two concerns cited when considering the use of archived DBSs are the effects of sample quantity and degradation with storage time 14 . Here we demonstrate that despite obtaining less DNA than recommended as input for the Infinium HumanMeth-ylation450 BeadChip from 19 of the DBS samples, high quality methylation profiles were obtained. Moreover, we did not observe an effect of storage time on the success of methylation profiling; however, as the longest time that a sample was stored on a Whatman FTA ® card in the present study was 10 years, it remains possible that longer storage times might exert an effect. An earlier demonstration of successful methylation profiling of two DBS samples that had been stored for 26-28 years on standard filter paper 35 does, however, provide additional support for the notion that extended storage time (as is the case for some potentially valuable archival sample collections) might not exert a deleterious effect.
Correlation and unsupervised hierarchical clustering analyses demonstrated a high degree of concordance between the DNA   A disadvantage to storing blood samples using Whatman FTA ® cards is that it is not possible to sort the sample into separate blood cell types. As DNA methylation varies between blood cell types, the issue of cellular heterogeneity must be addressed in order to avoid confounding in EWASs using whole blood DNA 25,26 . Fortunately, several algorithms have been developed previously to deal with the issue of cellular heterogeneity in EWAS studies (see Kaushal et al., 2017 48 for a comparison of these methods) and our data supports the applicability of these to blood samples stored using Whatman FTA ® cards.
Taken together, the results of this study support the use of Whatman FTA ® cards as an alternative, more cost-and spaceeffective storage method for blood samples collected for DNA methylation studies. We demonstrate that DNA from DBSs stored for up to 10 years can be successfully profiled using Infinium BeadChips, producing methylation profiles that are highly correlated with matched EDTA samples. We envisage that this approach to collecting and storing blood samples for DNA methylation profiling will help circumvent many issues faced by researchers amassing large collections of blood samples, particularly where access to freezer facilities is limited. These findings are also relevant to the ongoing discussion regarding the potential use of new-born Guthrie cards for research purposes and supports previous work assessing the potential for methylomic profiling of these samples 34-36,49,50 .

Data availability
Underlying data Due to the confidential and potentially identifiable nature of the DNA methylation data and participant phenotypic information, it is not possible to publically share the individual level  The manuscript presents evidence to support the usage of aged dried blood spots as an adequate substitute for fresh or frozen EDTA blood samples. As dried blood spots are easier to collect and cheaper to store than fresh blood, and are stored routinely at birth in most countries, the knowledge generated is quite useful and adds a little novelty to the literature. The manuscript is generally well written, easy to follow and scientifically sound. However, it is not as novel as it claims.
Firstly, the authors state that their study is the first methylome-wide study to compare fresh and archived dried blood spots. This is not the case; the following studies have indeed done something the same or very similar: Aberg KA et al, High quality methylome-wide investigations through next-generation sequencing of DNA from a single archived dry blood spot . 1