A tumor gains its selective advantage from ‘driver’ mutations in genes involved in key pathways regulating cell identity, survival and genome stability. However, tumorigenic processes are mutagenic, and there may be many more mutated ‘passenger’ genes that confer no further advantage to the cell and expose the tumor to immune surveillance by the body. Some genes are recurrently mutated, whereas others are rarely mutated. The Pan-Cancer analysis group used comparisons across tumor types to refine its ability to discriminate driver mutations, enabling the improved delineation of the pathways through which driver mutations exert their effects on tumor initiation, proliferation and spread.
Main
Emerging landscape of oncogenic signatures across human cancers
Giovanni Ciriello et al. Nature Genetics 10.1038/ng.2762
The complex landscapes of somatic modifications observed in tumors are typically the result of a relatively small number of functional oncogenic alterations (sometimes called driver events), which are outnumbered by non-functional alterations (passenger events) that do not substantially contribute to oncogenesis and progression. The low signal-to-noise ratio (ratio of the number of functional to non-functional events) presents a major challenge for data mining or data analysis.
Here we distilled thousands of genetic and epigenetic features altered in cancers to ∼500 selected functional events (SFEs). Using this simplified description, we derived a hierarchical classification of 3,299 TCGA tumors from 12 cancer types. The top classes are dominated by either mutations (M class) or copy number changes (C class). This distinction is clearest at the extremes of genomic instability, indicating the presence of different oncogenic processes.
At the top of this hierarchical classification, we identified two main tumor classes of similar size, each characterized by distinct sets of SFEs (Fig. 2a). Unexpectedly, although the distinction between copy number alterations and mutations was not used as a feature in our classification, these characteristic events were predominantly somatic mutations in one class and copy number alterations in the other (Fig. 2b). Closer inspection of the distribution of selected functional events showed a striking inverse relationship between copy number alterations and somatic mutations at the extremes of genomic instability, particularly in highly altered tumors (Fig. 2c).
Starting from this first major subdivision, we applied the network modularity algorithm recursively to the C class and M class tumors and to their subclasses. The result was hierarchical division into several levels of subclasses characterized by distinct patterns of functional alteration at each level of granularity (Fig. 3, Supplementary Fig. 5 and Supplementary Table 3).
Notably, TP53 mutations were an exception to this trend, as they were strongly enriched in the C class (q = 3 × 10^−176), consistent with early mutations in TP53 causing copy number genomic instability (Supplementary Fig. 1). This division into two main tumor classes indicates that recurrent copy number alterations and mutations are predominant in different subsets of tumors.
How somatic copy number alterations (SCNAs) affect cancer genes
Pan-cancer patterns of somatic copy number alteration
Travis Zack, Steven Schumacher et al. Nature Genetics 10.1038/ng.2760
Determining how somatic copy number alterations (SCNAs) promote cancer is an important goal. We characterized SCNA patterns in 4,934 cancers from The Cancer Genome Atlas Pan-Cancer data set. Whole-genome doubling, observed in 37% of cancers, was associated with higher rates of every other type of SCNA, TP53 mutations, CCNE1 amplifications and alterations of the PPP2R complex. SCNAs that were internal to chromosomes tended to be shorter than telomere-bounded SCNAs, suggesting different mechanisms underlying their generation. Significantly recurrent focal SCNAs were observed in 140 regions, including 102 without known oncogene or tumor suppressor gene targets and 50 with significantly mutated genes.
Tissue types from similar lineages tended to have similar rates of amplification and deletion in peak SCNA regions (Fig. 3a). We observed clusters of squamous cell carcinomas (head and neck squamous cell carcinoma, lung squamous cell carcinoma and bladder cancer) and reproductive cancers (ovarian and endometrial cancer) with breast cancer.
The features most associated with genes in the amplification and deletion peak regions are known to be associated with cancer (Fig. 3b). We applied GRAIL, which uses literature citations, to find common features of genes in selected regions of the genome.
Robust methodologies for driver detection
Mutational heterogeneity in cancer and the search for new cancer-associated genes
Michael Lawrence, Petar Stojanov, Paz Polak et al. Nature 10.1038/nature12213
Here we describe a fundamental problem with cancer genome studies: as the sample size increases, the list of putatively significant genes produced by current analytical methods burgeons into the hundreds. […] By incorporating mutational heterogeneity into the analyses, MutSigCV is able to eliminate most of the apparent artefactual findings and enable the identification of genes truly associated with cancer.
We analysed heterogeneity across patients with a given cancer type. Analysis of the 27 cancer types revealed that the median frequency of non-synonymous mutations varied by more than 1,000-fold across cancer types (Fig. 1).
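The effect of mutational heterogeneity on significance calls can be illustrated with a toy calculation. The sketch below is not MutSigCV's implementation; it only contrasts a single cohort-wide background rate with a gene-specific one, using a Poisson approximation. The function names, the rates and the gene dimensions are all hypothetical values chosen for illustration.

```python
from math import exp

def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), computed as 1 - P(X <= k - 1)."""
    term = exp(-lam)  # P(X = 0)
    cdf = 0.0
    for i in range(k):
        cdf += term
        term *= lam / (i + 1)  # P(X = i + 1) from P(X = i)
    return max(0.0, 1.0 - cdf)

def gene_p_value(observed, coding_bases, n_samples, rate_per_base):
    """Chance of seeing at least `observed` mutations under the background rate."""
    expected = coding_bases * n_samples * rate_per_base
    return poisson_tail(observed, expected)

# Hypothetical gene: 12 mutations over 1,500 coding bases in 200 tumors.
# Assumed rates: a cohort-wide average versus a higher gene-specific rate.
p_uniform = gene_p_value(12, 1500, 200, rate_per_base=2e-6)
p_gene_specific = gene_p_value(12, 1500, 200, rate_per_base=3e-5)
print(p_uniform, p_gene_specific)
```

Under the assumed cohort-wide rate the gene looks highly significant, whereas under the assumed gene-specific rate the same 12 mutations are unremarkable, which is the kind of artefact that a heterogeneity-aware background model is meant to remove.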
Significantly mutated genes (SMGs)
Mutational landscape and significance across 12 major cancer types
Cyriac Kandoth, Michael McLellan et al. Nature 10.1038/nature12634
In order to identify any genes that show positive selection in individual tumor types and across the 12 tumor types, the MuSiC-SMG test was performed to find those genes displaying significantly higher mutation frequencies than background. Our systematic analysis guided by gene expression data and manual curation (see Methods) discovered 127 significantly mutated genes (SMGs, Supplementary Table 4). Notably, 3,053 out of 3,281 total samples (93%) across the 12 Pan-Cancer types had at least one non-synonymous mutation in at least one of these 127 SMGs. These SMGs are involved in a wide range of cellular processes and can be broadly classified into 20 categories (Fig. 2). The top categories include transcription factors/regulators (21 genes), histone modifiers (13 genes), genome integrity (13 genes), RTK signaling (9 genes), cell cycle (7 genes), MAPK signaling (7 genes), PI3K signaling (6 genes), Wnt/β-catenin signaling (5 genes), histone (3 genes), ubiquitin mediated proteolysis (3 genes), and splicing (3 genes) (Fig. 2).
Combining multiple signals of positive selection to identify cancer drivers
Comprehensive identification of mutational cancer driver genes across 12 tumor types
David Tamborero, Abel Gonzalez-Perez et al. Scientific Reports 10.1038/srep02650
Driver genes can be identified by detecting signals of positive selection in their mutational pattern across tumors. High frequency of mutations is the most intuitive of these signals (detected by MutSigCV and MuSiC). Other complementary signals include: functional impact bias (OncodriveFM), clustering of mutations (OncodriveCLUST) and overrepresentation of mutations in phosphorylation sites (ActiveDriver) (Fig. 1a). Here we show that the combination of complementary methods allows identifying a comprehensive and reliable list of cancer driver genes. We describe the analysis of somatic mutations obtained via exome sequencing of 3,205 tumors from 12 tumor types by the Cancer Genome Atlas (TCGA) research network using these five complementary approaches. We combined the lists of driver candidates identified by these five methods both across the whole Pan-Cancer dataset and in each individual tumor type using a rule-based approach. This analysis results in the detection of 291 high-confidence mutational cancer driver genes (HCD) acting in these tumors (Fig. 1b). Among those genes, some have not been previously identified as cancer drivers and 16 have clear preference to sustain mutations in one specific tumor type.
One hundred and sixty-five of these candidates are novel findings not included in the CGC.
Thirteen selected non-CGC, or novel cancer genes are depicted in Figure 4 within their functional interaction context. These novel driver candidates appear alongside other well-established cancer genes.
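Combining the candidate lists of several detection methods can be sketched as a simple voting rule. The excerpt does not spell out the rule actually used, so the threshold below (a gene must be reported by at least two methods) is an assumption, and the gene names are placeholders rather than real calls.

```python
from collections import Counter

def combine_driver_calls(calls_by_method, min_methods=2):
    """Return genes reported by at least `min_methods` methods (hypothetical rule)."""
    votes = Counter(
        gene for genes in calls_by_method.values() for gene in set(genes)
    )
    return sorted(gene for gene, n in votes.items() if n >= min_methods)

# Placeholder candidate lists, one per method named in the text.
calls = {
    "MutSigCV": {"GENE_A", "GENE_B"},
    "MuSiC": {"GENE_A", "GENE_C"},
    "OncodriveFM": {"GENE_A", "GENE_B", "GENE_D"},
    "OncodriveCLUST": {"GENE_E"},
    "ActiveDriver": {"GENE_C"},
}

high_confidence = combine_driver_calls(calls)
strict = combine_driver_calls(calls, min_methods=3)
print(high_confidence, strict)
```

Raising `min_methods` trades sensitivity for specificity: genes supported by a single signal of positive selection drop out first.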
A resource to explore cancer drivers across tumor types
IntOGen-mutations identifies cancer drivers across tumor types
Abel Gonzalez-Perez et al. Nature Methods 10.1038/nmeth.2642
In addition to the data generated by the TCGA Research Network there are other initiatives focused on tumor genome resequencing, including projects within the International Cancer Genome Consortium and other independent projects.
The IntOGen-mutations platform (http://www.intogen.org/mutations/) summarizes somatic mutations, genes and pathways involved in tumorigenesis.
The IntOGen-mutations pipeline integrates the results of tumor genomes analyzed with different mutation-calling workflows and is scalable to hundreds of thousands of tumor genomes. It currently includes OncodriveFM, a tool that detects genes that are significantly biased toward the accumulation of mutations with high functional impact (FM bias) without the need to estimate background mutation rate, and OncodriveCLUST, which picks up genes whose mutations tend to cluster in particular regions of the protein sequence with respect to synonymous mutations (CLUST bias) (Online Methods). Both tools detect signals of positive selection, which appear in genes whose mutations are selected during tumor development and are therefore likely drivers.
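The idea of a clustering bias can be illustrated with a small permutation test. This is only a sketch and not the OncodriveCLUST algorithm: it uses a uniform placement of mutations along the protein as the null model, whereas the text describes a comparison against synonymous mutations. The window size, the number of permutations and the toy positions are assumptions.

```python
import random

def max_window_fraction(positions, window=5):
    """Largest fraction of mutations falling inside any span of `window` residues."""
    if not positions:
        raise ValueError("at least one mutation position is required")
    pos = sorted(positions)
    best, start = 0, 0
    for end in range(len(pos)):
        while pos[end] - pos[start] >= window:
            start += 1
        best = max(best, end - start + 1)
    return best / len(pos)

def clustering_p_value(positions, protein_length, window=5, n_perm=2000, seed=0):
    """Empirical p-value against uniformly placed mutations (assumed null model)."""
    rng = random.Random(seed)
    observed = max_window_fraction(positions, window)
    hits = 0
    for _ in range(n_perm):
        simulated = [rng.randint(1, protein_length) for _ in positions]
        if max_window_fraction(simulated, window) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Toy data: one gene with a hotspot, one with evenly spread mutations.
hotspot = [12, 12, 13, 12, 61, 12, 13, 12, 146, 12]
spread = [5, 30, 55, 80, 100, 120, 140, 160, 175, 185]
p_hotspot = clustering_p_value(hotspot, protein_length=189)
p_spread = clustering_p_value(spread, protein_length=189)
print(p_hotspot, p_spread)
```

The hotspot gene receives a small empirical p-value, while the gene with evenly spread mutations does not, which mirrors the intuition behind the CLUST bias.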
These scores are subsequently transformed (with transFIC) to compensate for the differences in baseline tolerance among genes, and each mutation is classified into one of four broad groups of impact, ranging from “None” to “High,” according to its consequence type and its transFIC MutationAssessor score (Fig. 1a).
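A classification of this kind can be sketched as a small decision function. Only the endpoints "None" and "High" and the two inputs (consequence type and transformed score) come from the text; the intermediate group names, the consequence labels and the score cut-offs below are assumptions made for illustration.

```python
def impact_class(consequence, transfic_score=None, low_cut=0.0, high_cut=2.0):
    """Assign one of four impact groups (hypothetical rules and cut-offs)."""
    if consequence == "synonymous":
        return "None"
    if consequence in {"stop_gained", "frameshift"}:
        return "High"
    if consequence == "missense":
        if transfic_score is None:
            raise ValueError("missense mutations need a transformed score")
        if transfic_score >= high_cut:
            return "High"
        if transfic_score >= low_cut:
            return "Medium"
        return "Low"
    # Consequence types not covered by this sketch are treated as no impact.
    return "None"

examples = [
    ("synonymous", None),
    ("stop_gained", None),
    ("missense", 2.5),
    ("missense", 1.0),
    ("missense", -1.0),
]
labels = [impact_class(c, s) for c, s in examples]
print(labels)
```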
We have analyzed somatic mutations in 4,623 samples from 31 different projects covering 13 anatomical sites (mainly from the International Cancer Genome Consortium (ICGC) and the TCGA) (Supplementary Tables 1–3).
A systematic analysis of sequenced tumor genomes permits a broad view of the impact of genes in tumorigenesis across cancer types (Supplementary Fig. 2). For example, TP53, ARID1A, KRAS or PIK3CA are frequently mutated and identified as cancer drivers in most cancer sites. Other genes, such as VHL in kidney, MAP3K1 and GATA3 in breast and STK11 in lung, seem to be primarily tumor-specific drivers.
The results of the pipeline are automatically loaded into a Web browser managed by the Onexus framework (Supplementary Fig. 1).
The results can be browsed through the Web (Supplementary Note 2) and with Gitools interactive heat maps (http://www.gitools.org/datasets/).
The pipeline may be downloaded and can also be run online on our servers. It can be used to identify drivers from newly sequenced cohorts of tumor samples (Supplementary Note 3) and to interpret the mutations observed in a tumor sample (Supplementary Note 4).
Drivers significantly mutated by common mutator processes
Evidence for APOBEC3B mutagenesis in multiple human cancers
Michael Burns, Nuri Temiz & Reuben Harris Nature Genetics 10.1038/ng.2701
Thousands of somatic mutations accrue in most human cancers, and their causes are largely unknown. We recently showed that the DNA cytidine deaminase APOBEC3B accounts for up to half of the mutational load in breast carcinomas expressing this enzyme. Here we address whether APOBEC3B is broadly responsible for mutagenesis in multiple tumor types. We analyzed gene expression data and mutation patterns, distributions and loads for 19 different cancer types, with over 4,800 exomes and 1,000,000 somatic mutations.
Taken together with the comprehensive analyses presented here of expression data (Fig. 1), CG base-pair mutation frequencies (Fig. 2), local cytosine mutation signatures (Fig. 3), overall mutation loads (Fig. 4) and kataegis (Fig. 4c and Table 1), all available data converge on the conclusion that APOBEC3B is a major source of mutation in multiple human cancers.
An APOBEC cytidine deaminase mutagenesis pattern is widespread in human cancers
Steven Roberts et al. Nature Genetics 10.1038/ng.2702
Within breast cancer, the HER2-enriched subtype was clearly enriched for tumors with the APOBEC mutation pattern, suggesting that this type of mutagenesis is functionally linked with cancer development. The APOBEC mutation pattern also extended to cancer-associated genes, implying that ubiquitous APOBEC-mediated mutagenesis is carcinogenic.
APOBEC signature mutations occurred at a higher frequency among carcinogenic mutations in the group of samples with high APOBEC presence compared to samples in which the APOBEC mutation pattern was not detected (Fig. 5).
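The notion of an APOBEC mutation pattern can be made concrete with a fold-enrichment calculation. The sketch below assumes the commonly described APOBEC target motif TCW (W = A or T) and compares the share of cytosine mutations that fall in this motif with the share of cytosines that sit in it. It scans a single strand of a toy sequence only; the sequence, the positions and the function names are hypothetical.

```python
def is_tcw(seq, i):
    """True if the cytosine at 0-based index i lies in a TCW motif (W = A or T)."""
    return (
        0 < i < len(seq) - 1
        and seq[i - 1] == "T"
        and seq[i] == "C"
        and seq[i + 1] in "AT"
    )

def apobec_enrichment(seq, mutated_c_positions):
    """Fold enrichment of cytosine mutations in TCW relative to motif availability."""
    context_c = sum(1 for base in seq if base == "C")
    context_tcw = sum(1 for i in range(len(seq)) if is_tcw(seq, i))
    mut_c = len(mutated_c_positions)
    mut_tcw = sum(1 for i in mutated_c_positions if is_tcw(seq, i))
    if context_tcw == 0 or mut_c == 0:
        raise ValueError("need at least one TCW site and one mutated cytosine")
    return (mut_tcw * context_c) / (mut_c * context_tcw)

# Toy sequence with 9 cytosines, 3 of them in a TCW motif (indices 2, 9, 17).
sequence = "ATCAGCCGTCTGGCCATCAACGC"
apobec_like = apobec_enrichment(sequence, [2, 9, 17, 5])
non_apobec = apobec_enrichment(sequence, [5, 6, 13])
print(apobec_like, non_apobec)
```

Values well above 1 indicate that mutated cytosines are over-represented in the motif; a real analysis would also count the reverse-complement strand and test the enrichment statistically.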
Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/.
About this article
Cite this article
Gonzalez-Perez, A., Tamborero, D., Lopez-Bigas, N. et al. Thread 1: Mutational drivers. Nat Genet (2013). https://doi.org/10.1038/ng.2786
DOI: https://doi.org/10.1038/ng.2786