A simple and cost-effective approach for technical validation of next generation methylation sequencing data


Abstract
Background DNA methylation is a fundamental epigenetic process that, in most cases, modulates gene expression levels. Changes in DNA methylation, either hypo- or hypermethylation, play a key role in many biological processes and in several human diseases, such as cancer. In the current study, we propose an approach to validate next generation methylation sequencing data.
Methods Genomic DNA was extracted from target and control samples (6 in each group), followed by bisulfite conversion. Next generation methylation sequencing and a methylation-sensitive high-resolution melting assay were carried out. The primers for methylation sequencing validation were designed using the R programming language.
Results The two groups, case and control, were discriminated based on the methylation sequencing results, and the real-time PCR-based results were in accordance with the next generation methylation sequencing results.
Discussion The methylation-sensitive high-resolution melting validation assay is a simple and cost-effective method that confirmed the next generation methylation sequencing results.

Background
DNA methylation is one of many fundamental epigenetic processes that modulate the expression levels of genes [1]. It functions as an annotation system for marking the genetic text, providing instructions as to how and when to read the information and control transcription. Molecularly speaking, DNA methylation is a covalent modification that occurs exclusively on cytosine nucleotides [2]. In vertebrates, it is characterized by the addition of a methyl or hydroxymethyl group at the C5 position of cytosine. These modifications are most commonly associated with CpG sites, that is, sites where a cytosine is immediately followed by a guanine in the 5' to 3' direction. Non-CpG methylation in CHH and CHG contexts (where H = A, C or T) can be found in embryonic stem cells and in plants [3]. Changes in DNA methylation, either hypo- or hypermethylation, play a key role in many biological processes and several human pathologies [4]. DNA methylation is also associated with gene imprinting, which has been linked to syndromes such as Prader-Willi, Angelman, or Beckwith-Wiedemann.
In cancer, it is likely that many of the changes in tumor cells have epigenetic origins [2]. Abnormal DNA methylation in cancer cells may be classified into two categories: site-specific CpG island promoter hypermethylation and global DNA hypomethylation [1]. High-throughput technologies for detecting cancer mutations have demonstrated that many of the affected genes are involved in DNA methylation metabolism or in the control of chromatin structure. These results raise the possibility that the epigenetic state of tumors may reflect one of the many mutational consequences that occur in cancer development [5]. Notably, aberrant methylation appears to take place prior to the transformation of cancer cells and could therefore potentially be used for early detection [6].
In the past few years, demand for and interest in non-invasive cancer detection assays based on aberrant methylation states have increased [7]. In addition to high-throughput mutation detection techniques, other methods for methylation detection have emerged for studying methylation in cancer and for discovering methylation markers. These techniques can be divided into three categories: 1) methylation content assays: high-performance capillary electrophoresis or high-performance liquid chromatography; 2) methylation pattern and profiling: restriction landmark genomic scanning, methylated CpG-island amplification, amplification of intermethylated sites; 3) candidate gene approaches: methylation-sensitive restriction endonuclease-PCR/Southern, bisulfite sequencing, methylation-specific PCR, MethyLight technology [8].
In any methylation marker discovery or cancer study, it is critical to validate the methylation sequencing results obtained by next generation sequencing (NGS). Most commonly, generating NGS results requires the design of complementary nucleic acids as site-specific markers that target the genome. In many cases, designing the primers and probes is laborious, and the resulting assays may not properly represent the aberrant methylation alterations.
In the current study, we propose a novel, cost-effective and simple approach to validate NGS-based methylation results using a methylation-sensitive high-resolution melting assay.

Sample collection and preparation
Cancerous tissue samples from patients with colorectal cancer (N=6), confirmed by two gastroenterology pathology experts, were considered target (case) tissues (assigned as cases: T20, T31, T35, T45, T65, T67). Normal control tissues (N=6) were obtained from individuals who underwent colonoscopy screening and were negative for both adenomatous polyps and cancer throughout the entire colon (assigned as controls: N4, N7, N8, N10, N14, N16). Demographic characteristics, colonoscopy reports, history of drug intake, smoking status, and medical history were collected. The case and control groups were matched by their demographic features. The protocol was approved by Mashhad University of Medical Sciences (Grant number: 961906). Written informed consent, including permission for publication, was obtained from all participants in this study.

Nucleic Acid Isolation from Tissue
Genomic DNA was extracted from 5-25 mg of fresh tissue using the QIAamp® Fast DNA Tissue Kit (Qiagen, Valencia, CA, USA) following the manufacturer's instructions.

The Human Methyl-seq analysis comprised three preprocessing steps before the detection of differentially methylated regions. First, the total reads were assessed with a quality control tool [10]. Second, the raw sequencing reads were cleaned with Trim Galore [11]. Third, the bisulfite sequencing reads were aligned to the human reference genome (GRCh37/hg19) using Bismark [12] and converted into counts of methylated reads and covered reads (methylated plus unmethylated) for each cytosine.
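The third preprocessing step yields, for each cytosine, a count of methylated reads and a count of covered reads. As a minimal sketch of how such counts can be turned into per-site methylation levels, the Python snippet below parses lines in a six-column, tab-separated coverage layout (chromosome, start, end, methylation percentage, methylated count, unmethylated count). This layout is assumed here to match Bismark's coverage output; the names `CpGCall` and `parse_coverage_lines` are hypothetical and are not part of the original pipeline.

```python
from typing import Iterable, List, NamedTuple


class CpGCall(NamedTuple):
    """Read counts for one cytosine (hypothetical helper type)."""
    chrom: str
    pos: int
    methylated: int
    unmethylated: int

    @property
    def coverage(self) -> int:
        # Covered reads = methylated + unmethylated reads.
        return self.methylated + self.unmethylated

    @property
    def methylation_fraction(self) -> float:
        # Fraction of covering reads that support methylation.
        if self.coverage == 0:
            return float("nan")
        return self.methylated / self.coverage


def parse_coverage_lines(lines: Iterable[str]) -> List[CpGCall]:
    """Parse tab-separated coverage lines (assumed six-column layout)."""
    calls = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        chrom, start, _end, _pct, n_meth, n_unmeth = line.split("\t")
        calls.append(CpGCall(chrom, int(start), int(n_meth), int(n_unmeth)))
    return calls
```

Recomputing the fraction from the raw counts, rather than reading the percentage column, keeps the coverage available for later depth-based filtering.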

Target Identification and Primer Design
Primers for methylation sequencing validation were designed using the R programming language. Methylation-independent primers were designed based on the following criteria: 1.
Only one CpG site is located in this length.

3. Coverage is greater than the average depth per CpG site in the data.
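Two of the criteria above, a single CpG site within the target length and coverage above the average depth per CpG site, can be illustrated with a short filter. The sketch below is written in Python for illustration only; the authors' implementation was in R and is not reproduced here. The function name `select_candidate_sites`, the `window` parameter (standing in for the target length, which is not specified above), and the input dictionary are all assumptions. Only CpG sites present in the coverage data are considered, whereas a full implementation would presumably check the reference sequence as well.

```python
from statistics import mean
from typing import Dict, List, Tuple

Site = Tuple[str, int]  # (chromosome, position)


def select_candidate_sites(coverage: Dict[Site, int], window: int) -> List[Site]:
    """Return CpG sites that pass both illustrative criteria.

    coverage: read depth for each observed CpG site.
    window:   hypothetical target length (bp) centred on the site.
    """
    avg_depth = mean(coverage.values())

    # Group positions by chromosome to look up neighbouring CpG sites.
    by_chrom: Dict[str, List[int]] = {}
    for chrom, pos in coverage:
        by_chrom.setdefault(chrom, []).append(pos)

    half = window // 2
    selected = []
    for (chrom, pos), depth in sorted(coverage.items()):
        # Criterion: coverage greater than the average depth per CpG site.
        if depth <= avg_depth:
            continue
        # Criterion: no other CpG site inside the window around this site.
        neighbours = [p for p in by_chrom[chrom]
                      if p != pos and abs(p - pos) <= half]
        if not neighbours:
            selected.append((chrom, pos))
    return selected
```

With this interface, a call such as `select_candidate_sites(depths, window=100)` returns the isolated, well-covered sites in sorted order.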
The framework of our method is shown in Algorithm 1.

HRM analyses were performed with the temperature ramping from 65 to 97 °C. Fluorescence acquisition was set at the temperature recommended by the manufacturer. The melting curves and melting peaks were normalized by calculating the 'line of best fit' between two normalization regions, before and after the major fluorescence decrease representing the melting of the PCR product, using software version 1.1 provided with the LightCycler® 96 System.
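The normalization idea can be sketched as follows: fit one straight line to the pre-melt region and another to the post-melt region, then rescale each fluorescence reading to its relative position between the two lines. This is a generic baseline normalization written under that assumption; it is not the algorithm of the LightCycler® 96 software, whose internals are not described here. The function names and the region boundaries are hypothetical.

```python
from typing import List, Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares line through the points; returns (slope, intercept)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def normalize_melt_curve(
    temps: Sequence[float],
    fluor: Sequence[float],
    pre_region: Tuple[float, float],
    post_region: Tuple[float, float],
) -> List[float]:
    """Scale fluorescence to ~1 before melting and ~0 after melting."""

    def region_fit(region: Tuple[float, float]) -> Tuple[float, float]:
        lo, hi = region
        pts = [(t, f) for t, f in zip(temps, fluor) if lo <= t <= hi]
        return fit_line([t for t, _ in pts], [f for _, f in pts])

    s_pre, i_pre = region_fit(pre_region)     # line of best fit before melting
    s_post, i_post = region_fit(post_region)  # line of best fit after melting

    normalized = []
    for t, f in zip(temps, fluor):
        top = s_pre * t + i_pre
        bottom = s_post * t + i_post
        normalized.append((f - bottom) / (top - bottom))
    return normalized
```

Each normalization region needs at least two readings at distinct temperatures for the line fit to be defined.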

Statistical analysis
To compare the characteristics of the target and control groups, a two-sample t-test was performed. A p value of less than 0.05 was considered statistically significant.
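For two groups of six samples, the comparison can be sketched as below. The text does not state whether equal variances were assumed, so the pooled-variance (Student) form is an assumption of this sketch, and the function name is hypothetical. With 6 + 6 samples there are 10 degrees of freedom, for which the two-sided critical value at the 0.05 level is approximately 2.228; an exact p value would require the t distribution from a statistics library.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple

# Two-sided critical value of Student's t for alpha = 0.05 and 10 degrees
# of freedom (two groups of six samples).
T_CRIT_DF10 = 2.228


def pooled_t_statistic(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Student's two-sample t statistic with pooled variance (assumed form)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df
```

For two six-sample groups, the difference is significant at the 0.05 level when `abs(t) > T_CRIT_DF10`.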

Results And Discussion
In the current study, the two groups, case (T20, T31, T35, T45, T65, T67) and control (N4, N7, N8, N10, N14, N16), were discriminated based on the methylation sequencing results. Their NGS features are indicated in Table 1. To validate the next generation methylation sequencing results, two primer sets, A and B, were used to target two different regions of the bisulfite-modified DNA. The primers designed in R are shown in Figure 1.
The methylation-sensitive high-resolution melting assay was conducted with the LightCycler® 96 System, and the results are shown in Figure 2. The real-time PCR results were in accordance with the next generation methylation sequencing results. There was a significantly