DNA methylation and hydroxymethylation analysis using a high throughput and low bias direct injection mass spectrometry platform

DNA modifications are small covalent chemical groups that modify nucleotides to regulate DNA readout. Anomalous abundance and genome-wide localization of these modifications can adversely alter gene expression and propagate into unbalanced epigenetic regulation, which is associated with multiple conditions such as cancer, diabetes and aging. We present a direct injection mass spectrometry (DI-MS) platform that offers fast, accurate and precise quantitation of global levels of DNA cytidine methylation (mC) and hydroxymethylation (hmC) in less than one minute per sample. In contrast to most methods that use mass spectrometry to analyze nucleotide modifications, this DI-MS approach eliminates liquid chromatography, which increases throughput and removes the carryover and batch effects caused by column contamination across samples. In addition, because no chromatographic separation is used, potential biases in the detection efficiency of modified nucleotides with different binding efficiencies to stationary phases are eliminated. This method can analyze >1000 samples per day, exceeding the throughput of next-generation sequencing.
• Direct injection mass spectrometry improves throughput and precision compared to liquid chromatography.
• Direct injection can be used to quantify global levels of DNA methylation and hydroxymethylation in less than one minute.
• The unbiased acquisition can potentially be used to analyze other nucleotide modifications.


Method details
The direct injection mass spectrometry (DI-MS) method we developed is designed for rapid and accurate detection and quantification of DNA methylation (mC) and hydroxymethylation (hmC) from extracted and digested DNA [1]. For the investigation of chromatin modifications, it complements the previously developed DI-MS approach for histone modification analysis [2, 3]. We foresee that it will be beneficial in both basic research and clinical settings, as it is a simple and robust approach that can potentially analyze thousands of samples per day. The system we present consists of an Advion TriVersa NanoMate connected online to a Thermo Scientific Orbitrap Fusion Lumos, although the method is suitable for other models and types of mass spectrometers. We present the details of the optimized sample preparation, analyte fragmentation and instrument parameters that achieved the highest sensitivity, precision and accuracy in measuring mC and hmC [1]. The DI-MS method also detects the formation of nucleoside byproducts, which go undetected in targeted methods and thereby affect quantitation. We demonstrated the high throughput capability of the DI-MS approach by analyzing 81 samples in about 1.5 h [1].
Calibration standards: A set of three 897 bp dsDNA standards containing either unmodified cytosine (cat # D5405-1), 5-methylcytosine (5mC) (cat # D5405-2) or 5-hydroxymethylcytosine (5hmC) (cat # D5405-3) was purchased from Zymo Research.

DNA extraction and RNA removal
For the extraction of the DNA from HepG2/C3A cells we used the DNeasy Blood & Tissue kit (Qiagen).
1. One million cells were centrifuged for 5 min at 300 x g.
2. The supernatant was discarded, and the pellet was resuspended in 200 μL of PBS.
3. Subsequently, 20 μL of proteinase K were added and mixed by vortexing.
4. Samples were incubated with 4 μL of RNase A (100 mg/mL) at room temperature for 2 min to obtain RNA-free genomic DNA.
5. 200 μL of Buffer AL were added, and samples were mixed immediately by vortexing and incubated at 56 °C for 10 min using a Thermomixer R (Eppendorf).
6. 200 μL of ethanol (100%) were added to the samples and mixed by vortexing until the solution was homogeneous.
7. The mixture was then transferred into the DNeasy Mini spin column placed in a 2 mL collection tube.
8. Tubes were centrifuged for 1 min at 6000 x g (or until all the volume had passed through the column). The flow-through and collection tubes were discarded.
9. The spin columns were placed in new 2 mL collection tubes and 500 μL of Buffer AW1 were added to the center of the column.
10. Tubes were centrifuged for 1 min at 6000 x g (or until all the volume had passed through the column). The flow-through and collection tubes were discarded.
11. The spin columns were placed in new 2 mL collection tubes and 500 μL of Buffer AW2 were added to the center of the column.
12. Tubes were centrifuged for 3 min at 20,000 x g to dry the DNeasy membrane. The flow-through and collection tubes were discarded.
13. Each column was transferred to a new 1.5 mL tube and 200 μL of Buffer AE were added to the center of the column.
14. Columns were incubated at room temperature for 1 min and then centrifuged for 1 min at 6000 x g to elute the DNA.
15. The concentration of total DNA was determined using a NanoDrop ND1000 instrument (Thermo Fisher).
16. Samples were then stored at -20 °C or immediately used for DNA hydrolysis.

DNA hydrolysis
For the hydrolysis of the DNA, we used the Nucleoside Digestion Mix (New England BioLabs).
1. 500 ng of extracted DNA was mixed with 2 μL of Nucleoside Digestion Mix Reaction Buffer (10X), 1 μL of Nucleoside Digestion Mix and water to a final volume of 20 μL.
2. The mixture was incubated at 37 °C for 1 h using a Thermomixer R (Eppendorf).
3. Samples were then stored at -20 °C or immediately desalted for DI-MS analysis.
Incomplete digestion can be detected by LC-MS. As reported in our previously published article [1], the extracted chromatogram of C then shows more than one peak at different retention times, indicating that nucleoside dimers or oligomers remain in incompletely digested DNA.
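The reaction volumes in step 1 depend on the DNA concentration measured after extraction. The following is a minimal sketch of that arithmetic, assuming the concentration is available in ng/μL (for example from the NanoDrop reading); the function name `digestion_mix` and its parameters are hypothetical and are not part of the kit protocol.

```python
def digestion_mix(dna_conc_ng_per_ul, dna_ng=500.0, total_ul=20.0,
                  buffer_ul=2.0, enzyme_ul=1.0):
    """Return the pipetting volumes (in uL) for one 20 uL hydrolysis reaction.

    Defaults follow the protocol text: 500 ng DNA, 2 uL of 10X reaction
    buffer, 1 uL of Nucleoside Digestion Mix, water to 20 uL.
    """
    if dna_conc_ng_per_ul <= 0:
        raise ValueError("DNA concentration must be positive")
    dna_ul = dna_ng / dna_conc_ng_per_ul
    water_ul = total_ul - buffer_ul - enzyme_ul - dna_ul
    if water_ul < 0:
        # The DNA alone would exceed the available volume; the sample is too dilute.
        raise ValueError("DNA is too dilute to fit 500 ng in the reaction volume")
    return {
        "dna_ul": dna_ul,
        "buffer_ul": buffer_ul,
        "enzyme_ul": enzyme_ul,
        "water_ul": water_ul,
    }


# Illustrative input only: a sample measured at 50 ng/uL.
volumes = digestion_mix(50.0)
print(volumes)
```

With these defaults, any sample below roughly 29.4 ng/μL cannot supply 500 ng within the 17 μL left after buffer and enzyme, and the sketch raises an error rather than returning a negative water volume.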

Sample desalting
All samples described in the manuscript were desalted using HyperSep™ Hypercarb™ SPE 96-well plates before direct injection into the MS. A Sorvall Legend XTR refrigerated centrifuge was used to spin out the washing solution and eluent at 500 x g for 3 min.
The washing solution was discarded and a new 96-well PCR microplate (Axygen) was placed as a collecting plate.
4. Nucleosides were eluted with 100 μL of buffer containing 60% ACN and 0.1% FA.
5. The sample plate was dried in a vacuum centrifuge.
6. The dried nucleosides were stored in a -20 °C freezer and resuspended in 70% ACN right before DI-MS analysis.

Fig. 1 (NanoMate settings; caption partially recovered). For high throughput injection, that value can be set as low as just a few seconds; the mass spectrometer only requires one acquisition cycle to obtain all the required spectra. The middle left panel is where the electrospray voltage is set. We used a value that showed robust spraying across samples, to minimize the incidence of failed ionization during the batch. The bottom left panel can be activated to change the chip nozzle in case the spray current is insufficient to provide proper ionization. For high throughput experiments, that setting is unnecessary, as by the time the instrument changes nozzle the acquisition in the mass spectrometer has already finished. On the right are optional settings that do not affect the analysis either way.

Sample analysis via direct injection mass spectrometry (DI-MS)
DI-MS analysis was performed with a TriVersa NanoMate (Advion) coupled online with the Orbitrap Fusion Lumos mass spectrometer (Thermo Scientific). The NanoMate was programmed to pick up 5 μL of solution followed by a 0.5 μL air gap to avoid spilling. Samples were sprayed into the mass spectrometer using a gas pressure of 0.3 psi and a positive voltage set at 1.7 kV with a time range of 10 min. Contact closure to start MS acquisition was set at 2.5 s after engaging the probe to the instrument chip nozzle. This means that the NanoMate sprayed for a few seconds before the mass spectrometer started acquiring; this time gap allows the current to stabilize and minimizes inaccurate readouts. NanoMate settings are shown in Fig. 1. The acquisition settings for the mass spectrometer were as follows: 30 V for source fragmentation energy, 50% for radio frequency (RF) lens, 2.3 kV for spray voltage and 275 °C for the heated capillary. The 30 V source fragmentation energy breaks all the different forms of C (monomer, dimer, sodium adducts, etc.) down to the nucleobase cytosine, which benefits quantification. The 50% RF lens setting helps maximize the intensity of C and improves the sensitivity of this method. More details can be found in our previously published article [1].
The automatic gain control (AGC) target was set to 1.0e6 and the maximum injection time was 100 ms. Notably, the injection time can be set to very high values (even higher than 1 second) during direct injection analysis. A long scan time increases sensitivity, as the instrument has more time to accumulate the desired ions. A rapid scan rate is instead essential during chromatographic separation, as the molecule elutes only in a specific time range of the gradient. The full scan range was initially set to 110-600 m/z and the resolution of the Orbitrap was set at 120,000. This scan range was chosen to include the protonated nucleobases of cytosine (C), methylcytosine (mC) and hydroxymethylcytosine (hmC) separated from the deoxyribose, the intact nucleosides and potential dimers. With the specified 30 V source fragmentation energy, the signal should be almost exclusively the nucleobase (>90%) [1], corresponding to m/z 112.0505 (C), 126.0662 (mC) and 142.0611 (hmC) (Fig. 2A). In the low mass range, it is also possible to observe the nucleobases of A, T and G, indicating that possible modifications on other nucleosides can also be quantified with our approach. To achieve higher sensitivity, we also performed scans using very narrow windows isolating each individual analyte (Fig. 2B-D). For instruments that can accumulate ions prior to scanning them (ion trap, Orbitrap, trapping ion mobility), we recommend performing targeted scans, as they achieve higher sensitivity due to the reduced accumulation of undesired signals. The acquisition time for each sample was set to 10-15 seconds and the resulting spectra were averaged prior to extracting the signal intensity for each analyte.
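The m/z values quoted above can be cross-checked from the elemental composition of each protonated nucleobase. The sketch below does this with standard monoisotopic masses; the helper names (`protonated_mz`, `ppm_window`) and the 10 ppm tolerance are illustrative assumptions, not settings taken from the method.

```python
# Monoisotopic masses (Da) of the relevant elements and of the proton.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727646

# Elemental composition of the neutral nucleobases.
NUCLEOBASES = {
    "C": {"C": 4, "H": 5, "N": 3, "O": 1},    # cytosine
    "mC": {"C": 5, "H": 7, "N": 3, "O": 1},   # 5-methylcytosine
    "hmC": {"C": 5, "H": 7, "N": 3, "O": 2},  # 5-hydroxymethylcytosine
}


def protonated_mz(formula):
    """m/z of the singly protonated ion [M+H]+ for an elemental formula."""
    neutral = sum(MONOISOTOPIC[element] * count for element, count in formula.items())
    return neutral + PROTON


def ppm_window(mz, ppm=10.0):
    """Lower and upper m/z bounds for a given ppm tolerance (assumed value)."""
    delta = mz * ppm / 1e6
    return mz - delta, mz + delta


for name, formula in NUCLEOBASES.items():
    mz = protonated_mz(formula)
    low, high = ppm_window(mz)
    print(f"{name}: {mz:.4f} (window {low:.4f} to {high:.4f})")
```

Rounded to four decimals, the computed values agree with the 112.0505, 126.0662 and 142.0611 listed in the text.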

Data analysis
The Xcalibur (Thermo Scientific) software was used for DI-MS data analysis. Signal intensities were extracted by manually copying and pasting the spectrum list into a spreadsheet. The global levels of mC and hmC are calculated using the following equations:
%mC = I(mC) / [I(C) + I(mC) + I(hmC)] × 100
%hmC = I(hmC) / [I(C) + I(mC) + I(hmC)] × 100
where I denotes the extracted signal intensity of the corresponding nucleobase. The result is what we call the observed ratio, which must be converted to the actual ratio using the calibration curve shown in Fig. 4. Notably, the denominator should also account for formylC (5fC) and carboxyC (5caC), although this is unnecessary when these modifications are below the limit of detection.
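The manual spreadsheet step could be scripted along the following lines. This is a sketch under stated assumptions, not the authors' workflow: it assumes the averaged spectrum is available as (m/z, intensity) pairs, that the observed ratio is the modified-nucleobase intensity divided by the summed C, mC and hmC intensities, and that the calibration curve is a straight line of observed versus expected percentage. The function names, the 10 ppm tolerance and the spectrum values are all hypothetical.

```python
TARGET_MZ = {"C": 112.0505, "mC": 126.0662, "hmC": 142.0611}


def extract_intensity(spectrum, target_mz, ppm=10.0):
    """Sum the intensities of all peaks within a ppm tolerance of target_mz."""
    tolerance = target_mz * ppm / 1e6
    return sum(inten for mz, inten in spectrum if abs(mz - target_mz) <= tolerance)


def observed_percent(i_c, i_mc, i_hmc):
    """Observed %mC and %hmC relative to the total cytosine signal."""
    total = i_c + i_mc + i_hmc
    if total <= 0:
        raise ValueError("no cytosine signal detected")
    return 100.0 * i_mc / total, 100.0 * i_hmc / total


def calibrate(observed, slope, intercept=0.0):
    """Invert a linear calibration: observed = slope * actual + intercept."""
    return (observed - intercept) / slope


# Made-up spectrum list for illustration: (m/z, intensity) pairs.
spectrum = [
    (112.0505, 9.5e6),  # C
    (126.0662, 4.0e5),  # mC
    (142.0611, 1.0e5),  # hmC
    (136.0618, 3.0e6),  # unrelated peak, ignored by the extraction
]

intensities = {name: extract_intensity(spectrum, mz) for name, mz in TARGET_MZ.items()}
pct_mc, pct_hmc = observed_percent(intensities["C"], intensities["mC"], intensities["hmC"])
print(pct_mc, pct_hmc)
print(calibrate(pct_mc, slope=0.8))  # slope value is a placeholder
```

If 5fC and 5caC are detectable, their intensities would be added to the denominator in `observed_percent`, in line with the note in the text.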

Method validation
The newly developed DI-MS method was validated by re-injecting the same samples using liquid chromatography coupled online with mass spectrometry (LC-MS) [1], a method more commonly used in the literature [4][5][6][7][8][9][10]. Using 13 different cell lines, we demonstrated that DNA methylation was quantified with high reproducibility by both DI-MS and LC-MS (Fig. 3A). DI-MS provided comparable, and occasionally smaller, variance than LC-MS. This suggests that the method is not only faster, but also more precise. The quantification of hydroxymethylation was not as reproducible between the two approaches (Fig. 3B), as hmC was not detectable in most cell lines using LC-MS. In kidney cell models, LC-MS showed higher sensitivity than DI-MS, although its variability was remarkably higher, as shown by the error bars. This discrepancy in detection is described in Sun et al. as the result of the poor retention of hmC on C18 chromatography, which makes hmC easier to quantify by DI-MS [1].
As additional validation, we compared our measurements of mC and hmC in the given cell lines with results from other publications. The total 5mC and 5hmC levels in the DNA of the human kidney cell line 293T were previously measured and published [11]. Wahba et al. showed them to be 3.9% and 0.02% respectively, very close to the values of 4.17% and 0.02% determined by our DI-MS approach. Another recently published study [12] reported global methylation levels of HepG2 and HeLa to be ∼1.6% and ∼3.3% respectively, corresponding to a ratio of approximately 1:2. Our data gave 2.6% and 4.7%, a ratio of approximately 1:1.8. Insect cell lines were included in the analysis cohort and provided a negative control, as insect DNA is commonly hypomethylated. Published bisulfite sequencing research [13] on the insect orders Diptera and Lepidoptera reported methylation levels <1%. Our DI-MS data were consistent with bisulfite sequencing, detecting little to no methylation for both organisms.
The accuracy of DI-MS was confirmed by estimating the response linearity using standards of fully unmodified or modified oligonucleotides mixed at different ratios. Specifically, we mixed oligonucleotides with all cytidines methylated with oligonucleotides containing only unmodified cytidine over a range of 1% to 7%. Similarly, hmC-modified oligonucleotides were mixed with unmodified ones over a range of 0.13% to 1.75%. The analysis was performed with both DI-MS ( Fig. 4 A and B) and LC-MS ( Fig. 4 A and B). Both methods provided a high Pearson correlation for both analytes, i.e. R² = 0.99. However, DI-MS showed a curve slope closer to 1 than LC-MS when comparing the observed vs expected ratio between the modified and the unmodified nucleoside. This highlights the low detection bias of DI-MS towards differently modified nucleosides, while LC-MS had issues in obtaining the proper response for the hmC signal intensity due to its limited binding to the chromatographic column [1].
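The slope and R² of such an observed-versus-expected curve can be obtained with an ordinary least-squares fit. The sketch below illustrates the calculation in plain Python; the function `linear_fit` and the data points are invented for illustration and are not measurements from this study.

```python
def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept, plus R squared."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2 points)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy ** 2) / (sxx * syy)  # squared Pearson correlation
    return slope, intercept, r_squared


# Hypothetical calibration points: expected vs observed %mC.
expected = [1.0, 2.0, 3.0, 5.0, 7.0]
observed = [0.9, 1.8, 2.7, 4.5, 6.3]

slope, intercept, r_squared = linear_fit(expected, observed)
print(slope, intercept, r_squared)
```

A slope close to 1 indicates that the modified and unmodified nucleobases are detected with similar efficiency, which is the criterion used above to compare DI-MS with LC-MS.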
Finally, we reported the estimated limit of detection (LOD) and limit of quantification (LOQ) for both mC and hmC. The LODs were estimated to be approximately 0.001 fmol for both mC and hmC, and the LOQs were estimated as 2.1 fmol and 0.2 fmol for mC and hmC respectively. These values were comparable with existing methods [14][15][16], as illustrated in detail in the Supporting Information of Sun et al. [1].

Conclusions
Our newly developed DI-MS platform proved able to perform relative quantification of global levels of DNA methylation (mC) and hydroxymethylation (hmC) in a rapid, sensitive and unbiased manner. We demonstrated that our approach provides data comparable to the existing literature and to the gold standard LC-MS. In the referenced publication, we also performed a durability test by analyzing 81 samples in about 1.5 h [1], which indicates that this platform has the potential to analyze thousands of samples per day. By bypassing the chromatographic separation of LC-MS, potential issues of carryover, batch effects and analyte polarity can be avoided, which makes DI-MS an intrinsically more robust method.
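The daily throughput implied by the durability test can be projected with simple arithmetic. The sketch below assumes uninterrupted operation at the rate observed in the 81-sample run, which is an idealization; the function name `samples_per_day` is hypothetical.

```python
def samples_per_day(n_samples, run_hours, hours_per_day=24.0):
    """Project daily throughput from a timed batch, assuming a constant rate."""
    if run_hours <= 0:
        raise ValueError("run time must be positive")
    return n_samples / run_hours * hours_per_day


# 81 samples in about 1.5 h, as reported in the durability test.
print(samples_per_day(81, 1.5))  # 1296.0
```

Under this assumption the projection is consistent with the stated capacity of more than 1000 samples per day.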

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.