A Novel Method for Identifying the Potential Cancer Driver Genes Based on Molecular Data Integration

  • Original Article
Biochemical Genetics

Abstract

Identifying cancer driver genes is essential for personalized therapy. Most driver genes are mutated at intermediate (2–20%) or even lower frequencies, which makes drivers with low-frequency mutations difficult to detect. Other genomic aberrations, such as copy number variations (CNVs) and epigenetic changes, may also reflect cancer progression. In this work, we propose iPDG, a method for identifying potential cancer driver genes based on molecular data integration. DNA copy number variation, somatic mutation, and gene expression data from matched cancer samples are integrated. Using the iKGGE method, the "key genes" of a cancer are identified, and changes in their expression levels serve as auxiliary evidence for whether a mutated gene is a potential driver. For each mutated gene, we define a mutational effect that accounts for copy number variation, the mutation of the gene itself, and its neighbor genes. The method has two main steps. The first is data preprocessing: DNA copy number variation and somatic mutation data are integrated, the integrated data are mapped to a given interaction network, and a diffusion kernel is used to form the mutation effect matrix. The second obtains the key genes with the iKGGE method and constructs the connection matrix from the gene expression data of the key genes and the mutation effect matrix of the mutated genes. Experiments on TCGA breast cancer and glioblastoma multiforme datasets show that iPDG is effective not only at identifying known cancer driver genes but also at discovering rare potential driver genes. Functional enrichment analysis shows that these genes are clearly associated with the two cancer types.
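The preprocessing step described above can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: CNV and mutation calls are combined with a logical OR, the diffusion kernel is taken to be the standard graph heat kernel K = exp(-βL) with L the unnormalized graph Laplacian, and the names `diffusion_kernel`, `mutation_effect`, and `beta` are hypothetical rather than taken from iPDG.

```python
import numpy as np

def diffusion_kernel(adjacency, beta=0.5):
    """Graph heat kernel K = exp(-beta * L), with L = D - A (assumed form)."""
    A = np.asarray(adjacency, dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    # L is symmetric, so exp(-beta * L) = V diag(exp(-beta * w)) V^T.
    w, V = np.linalg.eigh(L)
    return (V * np.exp(-beta * w)) @ V.T

def mutation_effect(cnv, mutation, adjacency, beta=0.5):
    """Combine binary CNV and mutation calls (samples x genes), then
    spread each sample's aberrations over the interaction network.
    The logical-OR combination is an illustrative assumption."""
    combined = np.logical_or(cnv, mutation).astype(float)
    return combined @ diffusion_kernel(adjacency, beta)

# Toy network: four genes on a path g0 - g1 - g2 - g3, two samples.
adjacency = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]])
cnv = np.array([[1, 0, 0, 0],        # sample 0: CNV in g0
                [0, 0, 0, 0]])
mutation = np.array([[0, 0, 0, 0],
                     [0, 0, 0, 1]])  # sample 1: somatic mutation in g3
effect = mutation_effect(cnv, mutation, adjacency)
print(np.round(effect, 3))
```

In this toy case each row of `effect` keeps the same total as the input, because the kernel's rows sum to one, while the signal decays with network distance from the aberrant gene. This is one way a gene's neighbors can contribute to its mutational effect; the weighting used in the paper may differ.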

Acknowledgements

This work is supported by the National Natural Science Foundation of China (Grant Nos. 61672011, 61472467 and 61471169), and the Collaboration and Innovation Center for Digital Chinese Medicine of 2011 Project of Colleges and Universities in Hunan Province.

Author information

Corresponding author

Correspondence to Shu-Lin Wang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhang, W., Wang, SL. A Novel Method for Identifying the Potential Cancer Driver Genes Based on Molecular Data Integration. Biochem Genet 58, 16–39 (2020). https://doi.org/10.1007/s10528-019-09924-2

  • DOI: https://doi.org/10.1007/s10528-019-09924-2
