Published by De Gruyter, February 24, 2016

Comparing five statistical methods of differential methylation identification using bisulfite sequencing data

  • Xiaoqing Yu and Shuying Sun

Abstract

We present a comprehensive comparative analysis of five differential methylation (DM) identification methods developed for bisulfite sequencing (BS) data: methylKit, BSmooth, BiSeq, HMM-DM, and HMM-Fisher. We summarize the features of these methods from several analytical aspects and compare their performance on both simulated and real BS datasets. Our main findings are as follows. First, parameter settings can substantially affect the accuracy of DM identification: compared with the default settings, modified parameter settings yield higher sensitivity and/or lower false positive rates. Second, all five methods are more accurate when the simulated DM regions are long and have small within-group variation, but concordance among the methods is low, probably because they take different approaches to DM identification. Third, HMM-DM and HMM-Fisher yield relatively higher sensitivity and lower false positive rates than the other methods, especially in DM regions with large variation. Finally, among the three methods that involve methylation estimation (methylKit, BSmooth, and BiSeq), BiSeq best represents the raw methylation signals. Based on these results, we suggest that users select a DM identification method according to the characteristics of their data and the advantages of each method.
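The evaluation criteria named in the abstract (sensitivity, false positive rate, and concordance between methods) can be illustrated with a minimal sketch. It assumes per-CpG-site definitions and uses the Jaccard index as the concordance measure; both are assumptions made for illustration, since this section does not state the exact definitions used in the paper. The function names, the method labels, and the toy site indices are hypothetical and are not results from the study.

```python
from itertools import combinations


def sensitivity(called, true_dm):
    """Fraction of truly DM CpG sites that a method calls DM."""
    if not true_dm:
        raise ValueError("true_dm must not be empty")
    return len(called & true_dm) / len(true_dm)


def false_positive_rate(called, true_dm, all_sites):
    """Fraction of non-DM CpG sites that a method wrongly calls DM."""
    negatives = all_sites - true_dm
    if not negatives:
        raise ValueError("there are no non-DM sites")
    return len((called & all_sites) - true_dm) / len(negatives)


def jaccard(a, b):
    """Overlap of two call sets (assumed concordance measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pairwise_concordance(calls):
    """Jaccard index for every pair of methods, keyed by sorted name pair."""
    return {
        (m1, m2): jaccard(calls[m1], calls[m2])
        for m1, m2 in combinations(sorted(calls), 2)
    }


# Toy data: 20 CpG sites, 8 of them simulated as DM (made-up indices).
all_sites = set(range(1, 21))
true_dm = {3, 4, 5, 6, 12, 13, 14, 15}

# Hypothetical calls from two unnamed methods.
calls = {
    "method_A": {3, 4, 5, 6, 7},
    "method_B": {12, 13, 14, 15, 16, 17},
}

for name, called in calls.items():
    sens = sensitivity(called, true_dm)
    fpr = false_positive_rate(called, true_dm, all_sites)
    print(f"{name}: sensitivity={sens:.2f}, FPR={fpr:.3f}")

print(pairwise_concordance(calls))
```

In this toy example the two hypothetical methods have the same sensitivity but share no calls, which shows how methods can look similar on accuracy summaries while having low concordance.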


Corresponding author: Shuying Sun, Department of Mathematics, Texas State University, San Marcos, TX 78666, USA

Acknowledgments

This work was supported by Dr. Shuying Sun’s start-up funds and the Research Enhancement Program provided by Texas State University. We are very grateful to three anonymous reviewers for their comments and suggestions, which helped us greatly improve this manuscript.

Supplemental Material:

The online version of this article (DOI: 10.1515/sagmb-2015-0078) offers supplementary material, available to authorized users.


Published Online: 2016-2-24
Published in Print: 2016-4-1

©2016 by De Gruyter
