Abstract
In recent years, transcriptome sequencing has become widely used, with applications ranging from simple mRNA profiling to discovery and analysis of the entire transcriptome. One of its most common aims is to identify genes that are differentially expressed (DE) between two or more biological conditions, and to infer associated pathways and gene networks from the expression profiles. Such analyses can open avenues for further systematic investigation into potential biological mechanisms. Gene set (GS) enrichment analysis is a popular approach for identifying pathways or sets of genes that are significantly enriched among differentially expressed genes. However, this approach treats a pathway as a simple collection of genes and disregards knowledge of gene or protein interactions. In contrast, topology-based methods integrate the topological structure of a pathway or gene network into the analysis. To provide a panoramic view of such approaches, this chapter demonstrates several recent computational workflows, including gene set enrichment and topology-based methods, for the analysis of DE pathways and gene networks from transcriptome-wide sequencing data.
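As a rough illustration of the gene set enrichment idea described above, the following minimal sketch tests whether a gene set overlaps a list of DE genes more than expected by chance, using an upper-tail hypergeometric test (one common form of over-representation analysis). It is not the chapter's own workflow: the function name `enrichment_p_value`, the gene identifiers, and the toy data are hypothetical and chosen only for illustration.

```python
from math import comb


def enrichment_p_value(universe, gene_set, de_genes):
    """Upper-tail hypergeometric P-value for the overlap of gene_set and de_genes.

    Genes outside the universe are ignored. The pathway is treated as a plain
    collection of genes; no interaction (topology) information is used.
    """
    universe = set(universe)
    gene_set = set(gene_set) & universe
    de_genes = set(de_genes) & universe

    n_universe = len(universe)
    n_set = len(gene_set)
    n_de = len(de_genes)
    overlap = len(gene_set & de_genes)

    # P(X >= overlap) when n_de genes are drawn without replacement
    # from a universe that contains n_set members of the gene set.
    tail = sum(
        comb(n_set, k) * comb(n_universe - n_set, n_de - k)
        for k in range(overlap, min(n_set, n_de) + 1)
    )
    return tail / comb(n_universe, n_de)


# Toy data (hypothetical gene identifiers, not taken from any real experiment)
universe = [f"g{i}" for i in range(1, 21)]
pathway = ["g1", "g2", "g3", "g4", "g5"]
de_genes = ["g1", "g2", "g3", "g4", "g10", "g11"]

p_value = enrichment_p_value(universe, pathway, de_genes)
print(f"enrichment P-value: {p_value:.4f}")
```

When many gene sets are tested this way, the resulting P-values would normally need a multiple-testing correction. Because the test uses only set membership, it also shows the limitation noted above: two pathways with the same genes but different interaction structure receive the same score.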
Acknowledgments
This work was supported by the Fundamental Research Funds for the Central Universities (Grant No. JZ2017YYPY0899). The authors are grateful to the editors and the anonymous reviewers for their valuable suggestions and comments facilitating the improvement of this chapter.
Copyright information
© 2018 Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Huang, Q., Sun, Ma., Yan, P. (2018). Pathway and Network Analysis of Differentially Expressed Genes in Transcriptomes. In: Wang, Y., Sun, Ma. (eds) Transcriptome Data Analysis. Methods in Molecular Biology, vol 1751. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-7710-9_3
DOI: https://doi.org/10.1007/978-1-4939-7710-9_3
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-7709-3
Online ISBN: 978-1-4939-7710-9
eBook Packages: Springer Protocols