Bioinformatics Analysis for Cell-Free Tumor DNA Sequencing Data

  • Protocol
Computational Systems Biology

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1754))

Abstract

As a major liquid biopsy biomarker, cell-free tumor DNA (ctDNA), which can be extracted from blood, urine, or other circulating fluids, can provide comprehensive genetic information about a tumor and overcomes the problem of tumor heterogeneity better than tissue biopsy does. Next-generation sequencing (NGS), developed in recent years, is widely used for analyzing ctDNA. Although the technologies for processing ctDNA samples are mature, detecting variants with low mutated allele frequency (MAF) in noisy sequencing data remains challenging. In this chapter, the authors first explain the difficulties of analyzing ctDNA sequencing data, then review related technologies, and finally present some novel bioinformatics methods for analyzing ctDNA NGS data more effectively.

Author information

Corresponding author

Correspondence to Shifu Chen.

Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Chen, S., Liu, M., Zhou, Y. (2018). Bioinformatics Analysis for Cell-Free Tumor DNA Sequencing Data. In: Huang, T. (eds) Computational Systems Biology. Methods in Molecular Biology, vol 1754. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-7717-8_5

  • DOI: https://doi.org/10.1007/978-1-4939-7717-8_5

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-7716-1

  • Online ISBN: 978-1-4939-7717-8

  • eBook Packages: Springer Protocols
