Elsevier

Current Opinion in Systems Biology

Volume 7, February 2018, Pages 16-25
Computational approaches for inferring tumor evolution from single-cell genomic data

https://doi.org/10.1016/j.coisb.2017.11.008

Highlights

  • Single-cell genomics provides the highest resolution for understanding intratumor heterogeneity and cancer evolution.

  • Experimental innovations are being developed to overcome the unique technical challenges.

  • Computational innovations are emerging for tumor clonality and phylogeny inference.

Abstract

Genomic heterogeneity in tumors results from mutations and selection of high-fitness single cells, the operational components of evolution. Precise knowledge about mutational heterogeneity and evolutionary trajectory of a tumor can provide useful insights into predicting cancer progression and designing personalized treatment. The rapidly advancing field of single-cell genomics provides an opportunity to study tumor heterogeneity and evolution at the ultimate level of resolution. In this review, we present an overview of the state-of-the-art single-cell DNA sequencing methods, technical errors that are inherent in the resulting large-scale datasets, and computational methods to overcome these errors. Finally, we discuss the computational and mathematical approaches for understanding intratumor heterogeneity and cancer evolution at the resolution of a single cell.

Introduction

Cancer is a disease emerging from a single cell in the somatic tissue and is driven by a complex interplay of somatic mutations, copy number alterations (CNAs) and chromosomal rearrangements [1, 2]. As a tumor progresses, diverse genomic aberrations give rise to genetically heterogeneous subpopulations (clones) of cells interacting with each other in a Darwinian framework of mutations, fitness and selection [3, 4, 5]. Intratumor heterogeneity (ITH) complicates the diagnosis and treatment of cancer patients and causes relapse and drug resistance [6, 7, 8]. The emergence of next-generation sequencing (NGS) technologies enabled a thorough analysis of tumor heterogeneity through the generation of large-scale quantitative genomic datasets [9, 10, 11]. However, despite these advances, a comprehensive understanding of ITH has proved elusive thus far [12, 13].

Bulk high-throughput sequencing has been the technology of choice for studying heterogeneity and tumor evolution [14, 15]. Subpopulations are computationally inferred [16, 17, 18, 19, 20, 21, 22] from variant allele frequencies (VAFs) of mutations detected in bulk DNA, which is an admixture of DNA from millions of cells in a cancer tissue. VAFs, however, provide a noisy signal for deconvoluting heterogeneity [23, 24] and cannot reliably reconstruct rare subclones or subclones that have similar frequencies in the tumor mass. Multi-region sequencing extends the single-sample approach of bulk sequencing by analyzing multiple samples obtained from different geographical regions of a tumor [25, 26, 27, 28]. Although multi-region sequencing can reveal geographically segregated subpopulations, resolving spatially intermixed subclones remains difficult, and this approach still relies on deconvolution of subclones for phylogeny inference [29].
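The limitation of VAFs described above can be made concrete with a minimal sketch. It assumes a pure tumor sample, a diploid region and heterozygous mutations, so that a mutation carried by a fraction f of cells has an expected VAF of f/2. The read counts and mutation names are hypothetical and serve only as an illustration.

```python
def variant_allele_frequency(alt_reads: int, ref_reads: int) -> float:
    """Fraction of reads covering a site that support the variant allele."""
    depth = alt_reads + ref_reads
    if depth == 0:
        raise ValueError("no reads cover this site")
    return alt_reads / depth


def expected_vaf(cell_fraction: float) -> float:
    """Expected VAF of a heterozygous mutation in a diploid region.

    Assumes a pure tumor sample in which `cell_fraction` of the cells
    carry the mutation on one of two alleles.
    """
    return cell_fraction / 2.0


# Hypothetical (alt, ref) read counts for three mutations in one bulk sample.
sites = {"mut_A": (48, 52), "mut_B": (21, 79), "mut_C": (19, 81)}
vafs = {name: variant_allele_frequency(alt, ref)
        for name, (alt, ref) in sites.items()}

for name, vaf in vafs.items():
    print(f"{name}: VAF = {vaf:.2f}")
```

Here mut_B and mut_C have nearly identical VAFs, both close to the value expected for a subclone making up about 40% of cells. From these numbers alone it is not possible to tell whether the two mutations belong to the same subclone or to two distinct subclones of similar size.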

The emergence of single-cell DNA sequencing (SCS) technologies has enabled sequencing of individual cancer cells, providing the highest-resolution view of the mutational histories of cancer [23, 30]. SCS aims to further our knowledge of different aspects of cancer biology, including resolving clonal substructure, tracing tumor evolution, identifying rare subclones and understanding the role of the cancer microenvironment in tumor progression [23, 24, 31]. In this review, we discuss the state of the art of SCS technologies, the technical challenges and the computational approaches to overcome them, and finally, approaches for understanding ITH and tumor evolution from SCS data.

Section snippets

An overview of single-cell DNA sequencing methods

Figure 1 illustrates the steps of a single-cell DNA sequencing study. The first step in producing high-quality SCS data is the isolation of individual cells. Early experiments used techniques such as serial dilution [32], microwell dilution [33], micropipetting [34] and laser-capture microdissection (LCM) [35] to isolate cells from a solid tissue. Several methods [36, 37] opted for isolation of single nuclei, which remain intact in frozen samples. Later, fluorescence-activated cell sorting (FACS) [38, 39] and

Single-cell sequencing errors

Technical artifacts arising during the single-cell DNA sequencing workflow can introduce noise into the datasets, confounding bioinformatics analysis (Figure 2). Inadvertent isolation of DNA from multiple cells violates the basic assumption of the methods designed for analyzing single-cell data, resulting in spurious biological conclusions [72]. Specifically, the presence of ‘cell doublets’ is a persistent error (ranging from 1% [38, 42, 56] to 10% [60, 61, 73]), in which more than one cell

Variant calling from single cells

Detection of copy number variants from SCS data commonly involves a variable binning method in which the genome is divided into bins and the read count in each bin indicates whether the region is over- or under-represented compared to a diploid genome [38, 56]. Loess normalization is applied to correct bias due to GC content, and circular binary segmentation (CBS) [76] is used to segment the copy number profiles. Specific algorithms account for technical artifacts introduced by WGA [77, 78]. A
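The read-count idea behind bin-based copy number detection can be sketched as follows. This is a deliberately simplified illustration, not the pipeline of any cited method: it uses fixed-width bins rather than variable bins, omits Loess GC correction and CBS segmentation, and assumes that the median bin is diploid. The function names and the toy read positions are hypothetical.

```python
from statistics import median


def bin_read_counts(read_starts, chrom_length, bin_size):
    """Count reads per fixed-width bin (a simplification of variable binning).

    `read_starts` are 0-based start positions smaller than `chrom_length`.
    """
    n_bins = (chrom_length + bin_size - 1) // bin_size
    counts = [0] * n_bins
    for pos in read_starts:
        counts[pos // bin_size] += 1
    return counts


def copy_number_estimates(counts, ploidy=2):
    """Scale bin counts so that the median bin corresponds to `ploidy` copies."""
    baseline = median(counts)
    if baseline == 0:
        raise ValueError("median bin count is zero; coverage is too sparse")
    return [ploidy * c / baseline for c in counts]


# Toy data: five 100-bp bins, with twice as many reads in the fourth bin.
read_starts = []
for b in range(5):
    n_reads = 20 if b == 3 else 10
    read_starts.extend(b * 100 + i * 5 for i in range(n_reads))

counts = bin_read_counts(read_starts, chrom_length=500, bin_size=100)
estimates = copy_number_estimates(counts)
print(counts)     # reads per bin
print(estimates)  # estimated copies per bin, relative to a diploid median
```

In this toy example the fourth bin receives an estimate of four copies, while the others stay at two. With real data, raw counts would first need GC correction and the per-bin estimates would be segmented before any calls are made.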

Subclonal reconstruction from single cells

Variants detected from single cells are used to infer clonal subpopulations. Dimensionality reduction techniques such as PCA [89] and multidimensional scaling [90] have been used to infer monoclonality [60] or polyclonality [63] of a tumor. Hierarchical clustering has been applied to CNV profiles [70, 91] as well as SNV profiles [62, 63, 86, 92] to uncover the clonal composition of a tumor. Failure to account for errors in variant calling can result in spurious clustering. To overcome the
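As a rough illustration of hierarchical clustering on SNV profiles, the sketch below groups cells by average-linkage agglomerative clustering with Hamming distance on binary mutation vectors. It is a naive version that ignores false positives, allelic dropout and missing entries, which is exactly the situation in which the text warns that clustering may become spurious. The function names and the toy genotype matrix are hypothetical.

```python
def hamming(a, b):
    """Number of sites at which two binary mutation profiles differ."""
    return sum(x != y for x, y in zip(a, b))


def average_linkage(c1, c2, profiles):
    """Mean pairwise Hamming distance between the cells of two clusters."""
    total = sum(hamming(profiles[i], profiles[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def cluster_cells(profiles, n_clusters):
    """Merge the closest pair of clusters until `n_clusters` remain."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], profiles)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Toy matrix: rows are cells, columns are SNV sites (1 = mutation called).
cells = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],  # one extra call, e.g. a false positive
    [1, 0, 0, 1, 1],
    [1, 0, 0, 1, 1],
    [1, 0, 0, 1, 0],  # one missing call, e.g. allelic dropout
]
clones = cluster_cells(cells, n_clusters=2)
print(clones)
```

With this toy input the two noisy cells still join the expected groups, because each differs from its clone by a single site. With higher error rates or more similar clones, the same procedure could split or merge clones incorrectly.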

Reconstruction of phylogeny from single cells

One of the major applications of SCS is to study tumor evolution via the inference of phylogeny, a binary genealogical tree along which the tumor cells evolve. Even though concepts borrowed from population genetics such as selection and fitness are useful in the context of tumor evolution [96], many concepts (e.g., meiotic recombination, sexual selection) do not apply to tumors [4]. The presence of technical artifacts further hinders a straightforward use of classical phylogeny inference

Conclusion & future directions

In conclusion, single-cell genomics is a promising new method that can improve many facets of cancer research by illuminating tumor initiation, metastasis and therapy resistance. In the clinic, these tools are likely to have important applications in early detection, non-invasive monitoring and personalized therapy. However, significant challenges remain and must be overcome before clinical applications can be realized. Even though the error rates of SCS datasets have

Funding

The study was supported by the National Cancer Institute (grant R01 CA172652 to KC), the NCI-designated Cancer Center Support Grant to MD Anderson Cancer Center (P30 CA016672), and the Andrew Sabin Family Foundation. This work was also supported by grants to NN from NCI (1RO1CA169244-01) and the Chan-Zuckerberg Foundation (HCA-A-1704-01668).

References (131)

  • Y. Li et al.

    Single-cell sequencing analysis characterizes common and cell-lineage-specific mutations in a muscle-invasive bladder cancer

    Gigascience

    (2012)
  • Y. Wang et al.

    Clonal evolution in breast cancer revealed by single nucleus genome sequencing

    Nature

    (2014)
  • X. Dong et al.

    Accurate identification of single-nucleotide variants in whole-genome-amplified single cells

    Nat Methods

    (2017)
  • C. Zhang et al.

    A single cell level based method for copy number variation analysis by low coverage massively parallel sequencing

    PLoS One

    (2013)
  • G. Ha et al.

    Integrative analysis of genome-wide loss of heterozygosity and monoallelic expression at nucleotide resolution reveals disrupted pathways in triple-negative breast cancer

    Genome Res

    (2012)
  • M.R. Stratton et al.

    The cancer genome

    Nature

    (2009)
  • B. Vogelstein et al.

    Cancer genome landscapes

    Science

    (2013)
  • P.C. Nowell

    The clonal evolution of tumor cell populations

    Science

    (1976)
  • L.M.F. Merlo et al.

    Cancer as an evolutionary and ecological process

    Nat Rev Cancer

    (2006)
  • L.R. Yates et al.

    Evolution of the cancer genome

    Nat Rev Genet

    (2012)
  • M. Greaves et al.

    Clonal evolution in cancer

    Nature

    (2012)
  • R.J. Gillies et al.

    Evolutionary dynamics of carcinogenesis and why targeted therapy does not work

    Nat Rev Cancer

    (2012)
  • R.A. Burrell et al.

    The causes and consequences of genetic heterogeneity in cancer evolution

    Nature

    (2013)
  • R. McLendon et al.

    Comprehensive genomic characterization defines human glioblastoma genes and core pathways

    Nature

    (2008)
  • C. Kandoth et al.

    Mutational landscape and significance across 12 major cancer types

    Nature

    (2013)
  • K.S. Korolev et al.

    Turning ecology and evolution against cancer

    Nat Rev Cancer

    (2014)
  • S.P. Shah et al.

    The clonal and mutational evolution spectrum of primary triple-negative breast cancers

    Nature

    (2012)
  • D.A. Landau et al.

    Mutations driving CLL and their evolution in progression and relapse

    Nature

    (2015)
  • A. Roth et al.

    PyClone: statistical inference of clonal population structure in cancer

    Nat Methods

    (2014)
  • C.A. Miller et al.

    SciClone: inferring clonal architecture and tracking the spatial and temporal patterns of tumor evolution

    PLoS Comput Biol

    (2014)
  • M. El-Kebir et al.

    Reconstruction of clonal trees and tumor composition from multi-sample sequencing data

    Bioinformatics

    (2015)
  • A.G. Deshwar et al.

    PhyloWGS: reconstructing subclonal composition and evolution from whole-genome sequencing of tumors

    Genome Biol

    (2015)
  • Y. Jiang et al.

    Assessing intratumor heterogeneity and tracking longitudinal and spatial clonal evolutionary history by next-generation sequencing

    Proc Natl Acad Sci

    (2016)
  • J.G. Reiter et al.

    Reconstructing metastatic seeding patterns of human cancers

    Nat Commun

    (2017)
  • N.E. Navin

    Cancer genomics: one cell at a time

    Genome Biol

    (2014)
  • M. Gerlinger et al.

    Intratumor heterogeneity and branched evolution revealed by multiregion sequencing

    N Engl J Med

    (2012)
  • M. Gerlinger et al.

    Genomic architecture and evolution of clear cell renal cell carcinomas defined by multiregion sequencing

    Nat Genet

    (2014)
  • J. Zhang et al.

    Intratumor heterogeneity in localized lung adenocarcinomas delineated by multiregion sequencing

    Science

    (2014)
  • L.R. Yates et al.

    Subclonal diversification of primary breast cancer revealed by multiregion sequencing

    Nat Med

    (2015)
  • T. Baslan et al.

    Unravelling biology and shifting paradigms in cancer with single-cell sequencing

    Nat Rev Cancer

    (2017)
  • R.G. Ham

    Clonal growth of mammalian cells in a chemically defined, synthetic medium

    Proc Natl Acad Sci U S A

    (1965)
  • J. Gole et al.

    Massively parallel polymerase cloning and genome sequencing of single cells using nanoliter microwells

    Nat Biotechnol

    (2013)
  • C. Zong et al.

    Genome-wide detection of single-nucleotide and copy-number variations of a single human cell

    Science

    (2012)
  • N. Nakamura et al.

    Laser capture microdissection for analysis of single cells

    Methods Mol Med

    (2007)
  • M.H. Tomasson

    Cancer stem cells: a guide for skeptics

    J Cell Biochem

    (2009)
  • M. Cristofanilli et al.

    Circulating tumor cells, disease progression, and survival in metastatic breast cancer

    N Engl J Med

    (2004)
  • N. Navin et al.

    Tumour evolution inferred by single-cell sequencing

    Nature

    (2011)
  • N.E. Potter et al.

    Single-Cell mutational profiling and clonal phylogeny in cancer

    Genome Res

    (2013)
  • C. Gawad et al.

    Dissecting the clonal origins of childhood acute lymphoblastic leukemia by single-cell genomics

    Proc Natl Acad Sci U S A

    (2014)
  • T. Baslan et al.

    Optimizing sparse sequencing of single cells for highly multiplex copy number profiling

    Genome Res

    (2015)