Methylation status of individual CpG sites within Alu elements in the human genome and Alu hypomethylation in gastric carcinomas

Background: Alu methylation is correlated with the overall level of DNA methylation and recombination activity of the genome. However, the maintenance and methylation status of each CpG site within Alu elements (Alu) have not been well characterized. This information is useful for understanding the natural status of Alu in the genome and helpful for developing an optimal assay to quantify Alu hypomethylation.
Methods: Bisulfite clone sequencing was initially carried out in 14 human gastric samples. A Cac8I COBRA-DHPLC assay was developed to detect the methylated-Alu proportion in cell lines, 48 paired gastric carcinomas, and 55 gastritis samples. DHPLC data were statistically interpreted using SPSS version 16.0.
Results: From the results of 427 Alu bisulfite clone sequences, we found that only 27.2% of CpG sites within Alu elements were preserved (4.6 of 17 analyzed CpGs, A ~ Q) and that 86.6% of the remaining CpGs were methylated. Deamination was the main reason for the low preservation of methylation targets. A high correlation coefficient of methylation was observed between Alu clones and CpG sites J (0.963), A (0.950), H (0.946), and D (0.945). Comethylation of the sites H and J was used as an indicator of the proportion of methylated-Alu in a Cac8I COBRA-DHPLC assay. Validation studies showed that hypomethylation and hypermethylation of Alu elements in human cell lines could be detected sensitively by the assay after treatment with 5-aza-dC and M.SssI, respectively. The proportion of methylated-Alu copies in gastric carcinomas (3.01%) was significantly lower than that in the corresponding normal samples (3.19%) and gastritis biopsies (3.23%).
Conclusions: Most Alu CpG sites are deaminated in the genome. 27% of Alu CpG sites are represented in our amplification products. 87% of the remaining CpG sites are methylated. Alu hypomethylation in primary gastric carcinomas could be detected quantitatively with the Cac8I COBRA-DHPLC assay.


Background
The Alu element is a member of the SINE family of repetitive elements and an example of a non-autonomous retrotransposon. It is the most abundant repetitive element in the human genome (more than one million copies per haploid genome), representing 10% of the genome mass [1]. Alu elements are mainly distributed in gene-rich regions. About 75% of gene promoters in the genome contain Alu elements [2].
A consensus Alu element usually contains 24 CpG sites (Figure 1) [3]. In fact, the CpGs within Alu elements harbour up to one-third of the total CpG sites in the genome [4,5]. In normal tissues most Alu elements are methylated and transcriptionally inactive. However, stress-induced demethylation of these CpGs could reactivate Alu transcription. Although Alu transcripts encode no protein, they can regulate expression of other genes, affect recombination, and influence patterns of nucleosome formation and evolution of the genome [6][7][8][9]. Demethylation of Alu elements is an indicator of lower genome stability, which is necessary for gene recombination and chromosome translocation [10]. The reverse transcriptase encoded by LINE-1 helps reverse transcription and transposition of Alu elements into the genome [11]. The total methylation content of Alu elements and LINE-1 sequences is highly correlated with the global DNA methylation content [12]. Estimation of the total methylation content of Alu elements is therefore useful for evaluating the global genomic methylation status and the level of homologous and non-homologous chromatin recombination in gene-rich regions.
Deamination and other variations of CpGs within Alu elements have occurred frequently during evolution, leading to diversification of Alu into 213 subfamilies [13]. However, the maintenance and methylation status at each CpG site within Alu has not been characterized experimentally. Several methods, such as the combined bisulfite restriction assay (COBRA), pyrosequencing, MethyLight and unmethylated Alu-specific amplification, have been developed in the past few years [14][15][16]. In those assays, the methylation content of one or more specific CpG sites within Alu elements was detected and used to represent the methylation/demethylation level of all Alu elements. However, it is not known how representative or consistent the methylation status of these CpG sites is. In the present study, we first analyzed the maintenance status and variations, including deamination, at each CpG site in human cancer and normal tissues. The methylation status of each CpG site was then evaluated based upon extensive bisulfite clone sequencing. Based upon the analysis of the correlation between the number of methylated CpGs and the methylation status of each CpG site within Alu elements, a novel and convenient COBRA-DHPLC assay was developed and successfully used to quantify changes in the proportion of methylated-Alu elements in human gastric carcinomas.

Figure 1 [...] [15]) and highlighted in the colour pink. The arrowed line points to the recognition sequence (including the sites H and J) of the restriction enzyme Cac8I. Rectangles, primer matching sequences. Arrowed boxes: primers F-1, F-2, and R-1 were used to amplify the bisulfite-converted templates, and F-w and R-w were used to amplify the templates without bisulfite treatment. The primers F-2 and R-1 were the same as described (Ref. [14]). The sites A-B-C were included within the probe sequence of MethyLight (Ref. [15]). The sites H-I-J were the target CpGs in the pyrosequencing assay and the site P was the target CpG in MboI-COBRA (Ref. [14]). The site M was the target CpG used for quantification of unmethylated Alu (Ref. [16]). The capital letter T in the colour pink results from evolutionary deamination of cytosine. The underlined CpG and TpA represent the methylated CpG site and the antisense-deaminated CpG site, respectively.

Human gastric mucosa samples
Forty-eight pairs of primary gastric carcinoma (GC) surgical tissues and their corresponding normal (GC-Nor) samples, and 55 gastric biopsies from patients with or without gastritis, were collected from inpatients and outpatients, respectively, at Beijing Cancer Hospital (male/female sex ratio, 7/3; 40-81 years old; average age, 60 years). All specimens were freshly frozen at -70°C. The Institutional Review Boards of Peking University School of Oncology approved the study and all patients gave written informed consent.

Cell lines and treatment with 5-aza-dC
Human carcinoma cell lines AGS and SW480 were cultured in F12 and DMEM medium (GIBCO), respectively, supplemented with 10% FBS at 37°C and 5% CO2. These cell lines were individually treated with 10 μmol/L (final concentration) of 5-aza-dC (Sigma) or an equal volume of PBS (pH 7.4, 5 μl/well with 500 μl medium) for 72 hours.

DNA extraction and bisulfite modification
Genomic DNA was extracted from tissue samples with phenol/chloroform as described [17]. The unmethylated cytosines of the genomic DNA were converted to uracils by treatment with 5 mol/L sodium bisulfite at 50°C overnight [18].

Cloning and sequencing
The PCR-1, PCR-2 and PCR-w products were cloned into a TA vector and sequenced on an ABI 3730 Analyzer. On average, 30 TA clones were sequenced for each sample.
Cac8I COBRA of the PCR-2 products
After bisulfite conversion, the 156-bp PCR-2 products of methylated Alu templates contain a 5'-GCgnGCg-3' sequence (including CpG sites H and J) that can be digested by Cac8I (Figure 1A). Thus, Cac8I digestion was used to develop the COBRA assay for detection of Alu methylation. The PCR-2 products (10 μL) were digested with 2 U of Cac8I (New England Biolabs) at 37°C for 6 hours. The 156-bp methylated Alu was cut into 112-bp and 44-bp fragments by Cac8I, whereas the unmethylated Alu was not cut because the cleavage site no longer existed after bisulfite modification. The digested PCR products were analyzed directly by DHPLC without purification. An equal amount of PCR-2 products from a fully methylated Alu clone was used as the standard Cac8I digestion control in every COBRA experiment. Complete cutting of the standard control products was used as the indicator of complete digestion of the tested samples.
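The logic of the digestion step can be illustrated with a minimal Python sketch. It assumes a short, hypothetical toy sequence (not the real 156-bp PCR-2 amplicon) and models only the top strand: unmethylated cytosines are read as T after bisulfite conversion and PCR, methylated cytosines stay C, and Cac8I cuts bluntly in the middle of GCN^NGC. The function names are illustrative only.

```python
import re

def bisulfite_convert(seq, methylated_positions):
    """Top-strand read after bisulfite conversion and PCR.

    Unmethylated C becomes T; C at an index listed in
    methylated_positions is protected and stays C.
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)

def cac8i_fragments(seq):
    """Fragment lengths after cutting at every GCN^NGC site (blunt cut)."""
    # Lookahead lets overlapping sites be found; cut after the third base.
    cuts = [m.start() + 3 for m in re.finditer(r"(?=GC[ACGT]{2}GC)", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]

# Hypothetical toy template: the C at index 4 and the C at index 8
# stand in for CpG sites H and J.
toy = "ACAGCGAGCGTT"

both_methylated = bisulfite_convert(toy, {4, 8})   # site survives
only_h = bisulfite_convert(toy, {4})               # J converted, site lost
unmethylated = bisulfite_convert(toy, set())       # both converted

print(cac8i_fragments(both_methylated))  # cut into two fragments
print(cac8i_fragments(only_h))           # uncut
print(cac8i_fragments(unmethylated))     # uncut
```

Only the template in which both CpG cytosines are protected retains the recognition site, which is why a cut product reports comethylation of the two sites on the same molecule.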

Separation and quantification of methylated- and unmethylated-Alu by DHPLC
The Cac8I-cut methylated-Alu and uncut unmethylated-Alu fragments in the PCR-2 digestion were separated with the WAVE DNA Fragment Analysis System (Transgenomic, Inc., Omaha, USA) at 48°C, the non-denaturing temperature optimised for analysis of double-stranded DNA fragments amplified from bisulfite-modified templates as described previously [19,20], and detected with a fluorescence (FL) detector [21]. The WAVE-HS1 FL-dye buffer (Transgenomic, Inc.) was used to enhance the FL intensity of PCR products (universal post-column labelling). The apparent total methylated-Alu proportion in the tested samples was calculated as the ratio of the peak height for the 112-bp methylated fragment to the total peak height of the methylated and 156-bp unmethylated fragment peaks. At equal molecule number, the peak height of the unmethylated-Alu PCR products was 81.22% of that of the Cac8I-digested methylated-Alu PCR products. Therefore, 81.22% was used as the constant to adjust the measured peak height for the methylated-Alu.
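The peak-height calculation described above can be written as a short Python sketch. The direction of the adjustment is an interpretation of the text rather than a confirmed detail: the measured methylated peak height is multiplied by 0.8122 so that equal molecule numbers give equal adjusted heights. The function name and the example peak heights are hypothetical.

```python
CORRECTION = 0.8122  # constant reported in the text (81.22%)

def methylated_alu_proportion(h_methylated_112bp, h_unmethylated_156bp,
                              correction=CORRECTION):
    """Proportion of methylated-Alu from two DHPLC peak heights.

    Assumption: the methylated (112-bp) peak is scaled by the
    correction constant before the ratio is taken.
    """
    adjusted = h_methylated_112bp * correction
    total = adjusted + h_unmethylated_156bp
    if total <= 0:
        raise ValueError("peak heights must sum to a positive value")
    return adjusted / total

# Hypothetical peak heights in arbitrary detector units.
print(methylated_alu_proportion(100.0, 81.22))  # equal molecule numbers
print(methylated_alu_proportion(3.8, 100.0))
```

With this reading, a sample holding equal numbers of methylated and unmethylated molecules yields a proportion of one half, which is the behaviour the correction constant is meant to restore.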

Data and statistical analysis
The Student's t-test was used to analyze the average methylation frequency of each CpG site within the sequenced clones and the total methylated-Alu proportion in different groups of gastric mucosa samples. The SPSS 16.0 software was used for these statistical analyses. Significance was defined as p < 0.05. The correlation coefficient between methylation of each CpG site and the methylated-Alu was calculated.

Results and Discussions
Analysis of methylation and mutation status of each CpG site within Alu elements in human gastric samples by bisulfite clone sequencing
The methylation status of each CpG site within the full-length Alu elements in both cancer and normal tissues has not previously been reported. Therefore, we carried out extensive bisulfite clone sequencing of the 224-bp PCR-1 products amplified with the primer set-1 (Figure 1A). [...] (Figure 2A). A significant difference in the CpG retention rate was observed at the site L between GC-Nor and normal control samples (P = 0.022).
Cytosine deamination at CpG sites, especially at methylated CpG sites, is a frequent event during the evolution of Alu elements. Deamination on the antisense strand (antisense-deamination, CpA) appears specifically as TpA in the bisulfite-modified sequences (Figure 1B). A significant difference in the antisense-deamination rate was found only at the site O when comparing the GC-Nor and normal samples (34.4% vs. 22.6%, p = 0.046), but not at any site when comparing the GC and GC-Nor samples (Table 1). These results suggest that the total antisense-deamination level in Alu elements was not significantly changed during carcinogenesis. Therefore, the results of all 427 sequenced clones were pooled for further analysis.
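How a consensus CpG position is read in a bisulfite clone can be sketched as a small lookup. This is a minimal illustration of the categories used in the text, not the authors' analysis pipeline; the function name, the label strings and the example calls are hypothetical.

```python
from collections import Counter

def classify_cpg_site(dinucleotide):
    """Classify the dinucleotide seen at a consensus CpG position
    in a bisulfite-converted clone sequence."""
    d = dinucleotide.upper()
    if d == "CG":
        # C survived conversion, so it was methylated.
        return "methylated CpG"
    if d == "TG":
        # Either an unmethylated CpG converted by bisulfite,
        # or an evolutionary sense-strand deamination (TpG).
        return "TpG (unmethylated or sense-deaminated)"
    if d == "TA":
        # Genomic CpA: its C is not in a CpG, so it converts to T.
        return "TpA (antisense-deaminated)"
    return "other variation"

# Hypothetical calls at one site across six clones.
calls = ["CG", "TG", "TA", "CG", "TA", "GG"]
print(Counter(classify_cpg_site(c) for c in calls))
```

The sketch makes explicit why TpG is ambiguous after bisulfite treatment while TpA is not, which is the reason the unconverted PCR-w clones are needed to estimate sense-strand deamination.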
Price et al. extracted about 480,000 Alu elements via a BLAST search of the human genome and sub-classified them into 213 subfamilies [13]. Using the same data, we carried out a bioinformatic analysis of the maintenance and variation status at each CpG site within Alu elements. According to this analysis, the average rate of retention of the 17 CpG sites within Alu was up to 89.7%, and the average rates of antisense-deamination (CpA), sense-deamination (TpG) and other variations were only 3.1%, 4.6%, and 2.7%, respectively (Figure 2C). However, in the present study, the average antisense-deamination rate was up to 30.2% among the 427 bisulfite clones (Table 1). To study whether the high deamination rate in the tested samples resulted from bisulfite modification bias, we carried out clone sequencing of Alu elements in one representative pair of GC and GC-Nor samples without bisulfite modification, using the primer set-w (Figure 1A and Figure 2B). The average rate of deamination was similar between the GC (27 clones) and GC-Nor (29 clones) samples: 24.8% and 27.6% for the antisense strand, and 23.5% and 24.5% for the sense strand, respectively. These results were consistent with the 30.2% antisense-deamination rate among the 427 clones. Thus, bisulfite modification bias is unlikely to be the reason for the high deamination rate observed in the present study.
Yang et al. also reported, based upon sequence analysis of 15 bisulfite clones, that almost two-thirds of the CpG sites in Alu elements are mutated [14]. The primer set-1 used in the bisulfite clone sequencing covers 470,988 of the extracted 476,152 Alu elements (98.9%). Among the 427 Alu clones, 69.3%, 18.5%, and 12.2% are AluS, AluY, and AluJ, respectively (Table 2) [...] AluY, and AluJ are 70.3%, 23.4%, and 6.3%, respectively (Table 2). Thus, PCR-1 bias, if any, is unlikely to favour amplification of particular Alu subfamilies, especially for the methylated Alu elements. Because methylation-independent CpG-free primers generally favour the amplification of unmethylated (and evolutionarily deaminated) sequences, we cannot exclude that PCR bias for GC-poor Alu elements might have led to the low prevalence of CpG sites within Alu elements in the present study. Such bias is likely unavoidable during PCR amplification of the diversified Alu elements.

Selection of methylated-CpG sites correlated well with methylated-Alu
Based on the sequencing results of the above 427 clones, we found that the average frequency of methylated CpG at each CpG site (based on the consensus Alu sequence) was 23.6%; TpG sites, 33.8%; TpA sites, 30.2%; other kinds of mutations, 12.5% (Table 1). TpG sites represent both the unmethylated CpGs modified by bisulfite and the evolutionarily sense-deaminated CpGs. If the average frequency of sense-deamination equals that of antisense-deamination, as suggested by sequencing of the PCR-w products, the frequency of unmethylated CpG at each CpG site was 3.6% (the difference between 33.8% and 30.2%). This means that only 27.2% (the sum of 23.6% and 3.6%) of Alu CpG sites are retained (4.62 CpG sites/clone) and that 86.6% (23.6% divided by 27.2%) of the remaining methylation-target CpG sites within Alu elements are methylated in the genome.
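The arithmetic in this paragraph can be retraced in a few lines of Python. The sketch uses only the rounded percentages quoted in the text and the stated assumption that sense- and antisense-deamination are equally frequent; variable names are illustrative. With these rounded inputs the final ratio comes out slightly above the reported 86.6%, a difference that presumably reflects rounding of the published frequencies.

```python
# Average per-site frequencies among the 427 bisulfite clones (percent).
methylated_cpg = 23.6
tpg = 33.8          # unmethylated CpG plus sense-deaminated CpG
tpa = 30.2          # antisense-deaminated CpG
n_sites = 17        # analyzed CpG sites A to Q

# Assumption from the text: sense-deamination equals antisense-deamination,
# so the unmethylated CpG share is what remains of TpG after removing it.
unmethylated_cpg = tpg - tpa

retained = methylated_cpg + unmethylated_cpg          # retained CpG sites, %
sites_per_clone = retained / 100 * n_sites            # retained sites per clone
methylated_of_retained = methylated_cpg / retained * 100

print(f"unmethylated CpG: {unmethylated_cpg:.1f}%")
print(f"retained CpG:     {retained:.1f}%")
print(f"sites per clone:  {sites_per_clone:.2f}")
print(f"methylated share: {methylated_of_retained:.1f}%")
```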
We further analyzed the distribution of frequencies of clones with different numbers of methylated-CpG sites, and found that 31% of clones (n = 133) contained 0~2 methylated CpG sites and 52% of clones (n = 222) contained 4~14 methylated CpG sites (Figure 3A, left).

Figure 2 [...]. Maintenance, deamination on the sense strand (TpG) and antisense strand (CpA) at each CpG site were analyzed based on 56 tested clone sequences from a gastric mucosa sample without bisulfite modification (B) and on 476,152 copies of Alu elements extracted from the NCBI database of the human genome (C) as described bioinformatically (Ref. [13]).
Because the number of methylated CpG sites in a clone is negatively correlated with the number of both TpG and TpA sites in the same clone (Figure 3A right, and 3B), we conclude that deamination on the sense and antisense strands, rather than demethylation or lack of methylation, is the main contributor to the low number of methylated CpG sites within these clones.
To identify the CpG sites that best represent methylation of Alu elements, the correlation of methylation status between each CpG site and the complete tested fragment of Alu was then calculated. When clones containing 0-2 and ≥ 4 methylated-CpG sites were defined as unmethylated and methylated Alu, respectively, the top four correlation coefficients were 0.963 for the site J, 0.950 for the site A, 0.946 for the site H, and 0.945 for the site D. The bottom four were 0.072 for the site I, 0.446 for the site L, 0.677 for the site O, and 0.692 for the site P (Additional file 1, Table S1). Apparently, the site P, which was used as the restriction site in the MboI COBRA assay [14], may not be a good detection target. The combined coefficient was 0.849 for the sites A-B-C, which were used as the MethyLight probe sequence [15], and 0.708 for the sites H and J, which were used in pyrosequencing [14].

Table footnotes: a 476152 copies of Alu elements extracted bioinformatically from the NCBI database of human genome (C) as described (Ref. [13]); b within the full sequence of Alu elements as illustrated in Figure 1A; c amplified from two representative samples with the primer set-2 after bisulfite modification.
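A per-site correlation of this kind can be sketched in Python as follows. The procedure follows the description given for Additional file 1 (clones grouped by their number of methylated CpGs, the site's average methylation frequency computed per subgroup, then correlated with the subgroup's methylated-CpG number), with the use of Pearson's coefficient being an assumption. The six four-site clones are hypothetical toy data, not the 427 sequenced clones.

```python
from collections import defaultdict
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient (undefined if either input is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def site_correlation(clones, site_index):
    """Correlate one site's methylation frequency with the number of
    methylated CpGs, across subgroups of clones sharing that number."""
    groups = defaultdict(list)
    for clone in clones:
        groups[sum(clone)].append(clone[site_index])
    counts = sorted(groups)
    freqs = [sum(groups[k]) / len(groups[k]) for k in counts]
    return pearson(counts, freqs)

# Hypothetical clones: 1 = methylated CpG, 0 = anything else.
toy_clones = [
    (0, 0, 0, 0),
    (1, 0, 0, 0),
    (0, 1, 0, 0),
    (1, 1, 0, 0),
    (1, 1, 1, 0),
    (1, 1, 1, 1),
]

for site in range(4):
    print(site, round(site_correlation(toy_clones, site), 3))
```

In the toy data the first site tracks the overall methylation load of a clone more closely than the last one, which is the property used in the text to rank sites J, A, H and D above sites I, L, O and P.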
Development of a COBRA-DHPLC assay to quantify the proportion of methylated-Alu copies
Although the combined coefficients for the sites A-B-C and H&J were lower than their individual coefficients (Additional file 1, Table S1), it is reasonable to expect that comethylation of these CpG sites might be more representative of methylated-Alu than methylation of a single CpG. In fact, when the Alu clones containing 0~2 methylated CpG sites were considered as unmethylated-Alu, the specificity of detecting methylated-Alu by comethylation of the sites A-B-C or H&J was up to 100%, whereas for individual CpG methylation it was lower: 97.0% for the site J, 95.5% for the site H, 93.2% for the site A, and 90.2% for the site D. Thus, we used detection of comethylation of the sites H&J to develop the following quantitative assay. As mentioned above, pyrosequencing was previously used to detect Alu methylation at the H-I-J sites [14]. However, unlike molecule-based (i.e. clone- or PCR copy-based) multiple-CpG-site assays such as methylation-specific PCR, MethyLight, DHPLC, and clone sequencing, pyrosequencing is not a molecule-based assay: it only provides the proportion of methylation at each individually tested CpG site in the pooled Alu elements, and the results for different CpG sites might represent different Alu copies. COBRA is one of the most convenient methods for detection of DNA methylation. When more than one CpG site is included in the restriction sequence, COBRA is also a molecule-based multiple-CpG-site assay, which can be used to detect methylated Alu copies in the genome.
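The specificity figures quoted above can be reproduced under one plausible reading, sketched here in Python: specificity is taken as the fraction of unmethylated-Alu clones (0~2 methylated CpGs, n = 133) that are not called methylated by a given marker. The false-positive counts below are not reported in the text; they are back-calculated values chosen only because they reproduce the published percentages, and should be treated as assumptions.

```python
def specificity(true_negatives, false_positives):
    """Fraction of unmethylated-Alu clones correctly called unmethylated."""
    return true_negatives / (true_negatives + false_positives)

N_UNMETHYLATED_CLONES = 133  # clones with 0~2 methylated CpG sites

# Assumed (back-calculated) numbers of unmethylated-Alu clones that are
# nevertheless methylated at the marker; not taken from the paper.
assumed_false_positives = {"J": 4, "H": 6, "A": 9, "D": 13, "H&J": 0}

for marker, fp in assumed_false_positives.items():
    tn = N_UNMETHYLATED_CLONES - fp
    print(f"{marker}: {100 * specificity(tn, fp):.1f}%")
```

Under this reading, a clone with at most two methylated CpGs can carry a methylated J or H by chance, but requiring both on the same molecule removes those false calls, which is the rationale for using a two-site restriction sequence.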
To develop a COBRA assay suitable for various kinds of stored samples, such as paraffin-embedded tissues, the PCR amplicon should be shorter than 200 bp and a single restriction site should be selected for analysis. However, we could not find a restriction enzyme with a single cut site for the sites A, D, H, and J within the PCR-1 products. Therefore, the primer set-2 was used to amplify the 156-bp PCR-2 product, which contains both the H and J sites and can be digested by Cac8I (recognition site, 5'-GCN^NGC-3') when the template is methylated (Figure 1A). The PCR-2 products comethylated at both the sites H and J contain a 5'-GCg^nGCg-3' sequence and thus can be digested into 112-bp and 44-bp fragments by Cac8I (Figure 1A and 4A). Clone sequencing of the PCR-2 products from two representative samples showed that the primer set-2 likely favoured the amplification of AluY clones (38%), which retain more methylation targets than AluS and AluJ (Table 2).
The traditional COBRA assay is gel-based and can only be used to detect target CpG methylation semi-quantitatively. DHPLC is a typical separation and quantification method that can also be used to detect DNA fragments and methylation of CpG islands, whether or not it is combined with other assays [19,20,22]. To detect the methylated-Alu proportion accurately, DHPLC was used to separate and quantify the Cac8I-cut (112-bp, methylated) and -uncut (156-bp, unmethylated) fragments. The methylated- and unmethylated-Alu in the digestion could be separated by DHPLC at the completely non-denaturing temperature of 48°C. The retention times for the methylated- and unmethylated-Alu peaks were 3.3 and 4.7 min, respectively. A linear relationship over a wide range was observed between the loading concentration (1/1~1/64) and the ratio of peak height for the methylated-Alu products (y = 0.9912x, R^2 = 0.995) (Figure 4B). The detection limit of this assay was about 3.4 × 10^6 copies of methylated Alu elements (the total copy number of Alu within one diploid cell is 2 × 10^6). The coefficient of variation (CV) of this assay was 7.4%. Because of the very high specificity (100%) of comethylation at both the sites H and J for the methylated Alu, we used the ratio of the 112-bp methylated Alu peak to the sum of the methylated and 156-bp unmethylated Alu peaks to represent the proportion of methylated Alu copies in the tested samples, as described in the Methods section.
Both hypomethylation of Alu elements in 5-aza-dC-treated AGS (2.12% → 1.90%) and SW480 cell lines (2.28% → 1.88%) and hypermethylation of Alu elements in M.SssI-methylated DNA templates (2.26% → 2.44% for AGS; 2.21% → 2.55% for SW480) could be detected sensitively by the Cac8I COBRA-DHPLC assay (Figure 4C). It was reported that the global genomic 5-methylcytosine content in the human genome is tissue-specific, with 3.43-4.26% of cytosine residues methylated in normal tissues [15,23,24]. We used the Cac8I COBRA-DHPLC method to detect the methylated-Alu proportion in 48 pairs of GCs and GC-Nor and in 55 gastric mucosa biopsy samples from noncancerous patients. The average methylated-Alu proportion in GCs (%, mean ± SD, 3.01 ± 0.25) was significantly lower than that in GC-Nor (3.19 ± 0.31) (Figure 5; P < 0.01) and that in gastric biopsies from patients without tumor (3.23 ± 0.57; P < 0.001; data not shown). We did not observe any significant association between the methylated-Alu proportion in GCs and the patients' clinical-pathological characteristics, such as lymph node metastasis, age, and sex (data not shown). This result is consistent with the hypothesis [...]

Figure 4 Chromatography of Cac8I digestion of methylated and unmethylated PCR-2 products. The electrophoresis image of the PCR-2 products of methylated- and unmethylated-Alu clones with and without Cac8I digestion (A); DHPLC chromatography of Cac8I-digested products after the methylated-Alu PCR-2 products were diluted with the unmethylated-Alu PCR-2 products at various ratios (B); after 5-aza-dC treatment (10 μM) or M.SssI modification, changes of the methylated-Alu proportion could be detected sensitively by the Cac8I COBRA-DHPLC assay (C). Open arrow and gray arrow point to peaks that correspond to the methylated- and unmethylated-Alu fragments, respectively. The gray square area is the single-direction magnified part of the open dash-line enclosed area.

Figure 5 Comparison of proportion of methylated-Alu in primary gastric carcinomas and the corresponding normal samples by the Cac8I COBRA-DHPLC assay. Bisulfite-modified Alu elements in 48 pairs of gastric carcinomas (GC) and the corresponding normal tissues (GC-Nor) with or without lymph node metastasis (M+/M-) were amplified with the primer set-2. The 156-bp PCR-2 products of Alu elements were digested with Cac8I at 37°C for 6 hours. The methylated-Alu was cut and the unmethylated-Alu was not cut by Cac8I. The digested PCR-2 products were separated by DHPLC at the non-denaturing temperature of 48°C. The proportion of methylated-Alu was calculated according to the ratio of the adjusted peak height for the methylated-Alu to that for the unmethylated-Alu. Hypomethylation was observed in 33 of 48 GCs and is marked with the colour blue.

Conclusions
Most Alu CpG sites are deaminated in the genome. 27% of Alu CpG sites are represented in our amplification products. 87% of the remaining CpG sites are methylated. Based on the analysis of extensive bisulfite clone sequencing, a Cac8I COBRA-DHPLC assay was developed to sensitively quantify the methylated-Alu proportion. Hypomethylation of Alu elements was observed in gastric carcinomas with the assay.
Additional file 1: Table S1. Correlation coefficients of methylation status between each CpG site and Alu clone subgroups with various methylated CpGs. All 427 Alu clones were classified into different subgroups according to the methylated-CpG number. The average methylation frequency of each CpG site within each Alu subgroup was calculated (methylated-CpG number of the CpG site divided by the clone number of Alu within the subgroup). The correlation coefficient was calculated for each CpG site based on the corresponding average methylation frequency within each subgroup and the total methylated-CpG number within the subgroup. [ http://www.biomedcentral.com/content/supplementary/1471-2407-10-44-S1.XLS ]