Platform-Independent Gene-Expression Based Classification-System for Molecular Sub-typing of Cancer

Personalized and Precision Medicine Informatics

Part of the book series: Health Informatics (HI)

Abstract

Molecular stratification of cancer patients is driving the development of targeted therapies in precision medicine. Clustering of tumor samples based on gene expression profiles from high-throughput platforms, such as microarrays or next-generation sequencing, has identified distinct tumor subtypes for numerous cancers. However, the majority of the derived classifiers or gene signatures have not reached clinical utility. Therefore, informatics methods that accurately translate a derived gene signature from the high-throughput platform to a clinically adaptable low-dimensional platform are critical. In this chapter, we discuss a workflow to derive gene signatures and then transfer them from one analytical platform to another for cancer patient stratification. We summarize the results of the workflow on two different cancers. Finally, we discuss the importance of data discretization in dealing with cross-platform data and of incorporating splice-variant or isoform-level gene expression profiles in the statistical analyses.
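To illustrate why data discretization can help with cross-platform data, the following is a minimal sketch of one generic approach, equal-frequency binning within a single sample. It is not taken from the chapter, whose specific discretization methods are not shown in this preview. The function name `equal_frequency_bins`, the gene labels, and all expression values are hypothetical and chosen only for illustration.

```python
from typing import List


def equal_frequency_bins(values: List[float], n_bins: int = 3) -> List[int]:
    """Map each value to a bin index in 0..n_bins-1 based on its rank.

    Illustrative assumption: bins are assigned by rank within one sample,
    so only the ordering of the values matters, not their scale.
    Tied values are ordered by position and may fall into different bins.
    """
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    n = len(values)
    # Indices of the values, sorted from lowest to highest expression.
    order = sorted(range(n), key=lambda i: values[i])
    bins = [0] * n
    for rank, idx in enumerate(order):
        # Integer arithmetic splits the ranks into n_bins near-equal groups.
        bins[idx] = rank * n_bins // n
    return bins


# Hypothetical values for the same six genes in one sample, measured on
# two platforms with very different numeric scales.
genes = ["G1", "G2", "G3", "G4", "G5", "G6"]
microarray = [5.1, 9.8, 7.2, 6.0, 11.3, 8.4]    # log2-intensity-like scale
rnaseq = [3.0, 250.0, 40.0, 12.0, 900.0, 95.0]  # count-like scale

array_bins = equal_frequency_bins(microarray)
seq_bins = equal_frequency_bins(rnaseq)

for gene, a, s in zip(genes, array_bins, seq_bins):
    print(gene, a, s)
```

In this toy case the two platforms rank the genes identically, so both yield the same low/medium/high codes even though the raw values are not comparable. Real data will agree only partially, and the choice of binning scheme and number of bins is itself something that would need to be evaluated.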

Acknowledgments

This work was supported by the National Library of Medicine of the National Institutes of Health [Award Number R01LM011297 to RD]. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Author information

Corresponding author

Correspondence to Ramana V. Davuluri.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Bi, Y., Davuluri, R.V. (2020). Platform-Independent Gene-Expression Based Classification-System for Molecular Sub-typing of Cancer. In: Adam, T., Aliferis, C. (eds) Personalized and Precision Medicine Informatics. Health Informatics. Springer, Cham. https://doi.org/10.1007/978-3-030-18626-5_10

  • DOI: https://doi.org/10.1007/978-3-030-18626-5_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-18625-8

  • Online ISBN: 978-3-030-18626-5

  • eBook Packages: Medicine (R0)
