Essential proteins discovery based on dominance relationship and neighborhood similarity centrality

Li, Gaoshi; Luo, Xinlong; Hu, Zhipeng; Wu, Jingli; Peng, Wei; Liu, Jiafei; Zhu, Xiaoshu

doi:10.1007/s13755-023-00252-9

Research
Published: 16 November 2023

Volume 11, article number 55 (2023)

Health Information Science and Systems

Gaoshi Li^1,2,3,
Xinlong Luo^1,2,3 (ORCID: orcid.org/0009-0000-6966-871X),
Zhipeng Hu^1,2,3,
Jingli Wu^1,2,3,
Wei Peng^4,
Jiafei Liu^1,2,3 &
Xiaoshu Zhu^1,2,3,5

Abstract

Essential proteins play a vital role in the development and reproduction of cells, and identifying them helps to clarify the basic requirements for cell survival. Because biological experimental methods for discovering essential proteins are time-consuming, costly, and inefficient, computational methods have gained increasing attention. Early computational methods identified essential proteins mainly through centrality measures on protein–protein interaction (PPI) networks, but their identification rate is limited by the many false positives in PPI networks. In this study, a purified PPI network is first introduced to reduce the impact of these false positives. Second, by analyzing the similarity relationship between a protein and its neighbors in the PPI network, a new centrality called neighborhood similarity centrality (NSC) is proposed. Third, a protein subcellular localization score and an ortholog score are calculated from subcellular localization data and orthologous data, respectively. Fourth, an analysis of a large number of methods based on multi-feature fusion shows that there is a special relationship among features, called the dominance relationship, and a novel model based on this relationship is proposed. Finally, NSC, the subcellular localization score, and the ortholog score are fused by the dominance relationship model to give a new method called NSO. To verify the performance of NSO, it is compared with seven representative methods (ION, NCCO, E_POC, SON, JDC, PeC, WDC) on yeast datasets. The experimental results show that NSO achieves a higher identification rate than the other methods.
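
The abstract does not give the formulas for NSC or for the dominance relationship model, so the following Python sketch is only an illustration of the general idea under stated assumptions, not the authors' method. It assumes (1) a stand-in neighborhood similarity score, computed as the sum of Jaccard similarities between the closed neighborhoods of a protein and each of its neighbors, and (2) a stand-in fusion step that ranks proteins by how many other proteins they Pareto-dominate across the three feature scores. All function names and the toy scores are hypothetical.

```python
def build_graph(edges):
    """Build undirected adjacency sets from (u, v) pairs; self-loops are skipped."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def neighborhood_similarity(adj):
    """Assumed stand-in for NSC (not the paper's formula).

    For each protein u, sum the Jaccard similarity between the closed
    neighborhood of u (u plus its neighbors) and that of each neighbor v.
    """
    closed = {u: nbrs | {u} for u, nbrs in adj.items()}
    return {
        u: sum(len(closed[u] & closed[v]) / len(closed[u] | closed[v]) for v in nbrs)
        for u, nbrs in adj.items()
    }


def dominates(a, b):
    """Pareto dominance: a is >= b on every feature and > b on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def rank_by_dominance(features):
    """Assumed stand-in for the fusion step (not the paper's model).

    Rank proteins by the number of other proteins they dominate,
    breaking ties alphabetically so the result is deterministic.
    """
    counts = {
        p: sum(dominates(fp, fq) for q, fq in features.items() if q != p)
        for p, fp in features.items()
    }
    ranking = sorted(counts, key=lambda p: (-counts[p], p))
    return ranking, counts


if __name__ == "__main__":
    # Toy PPI network and made-up localization / ortholog scores.
    adj = build_graph([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")])
    nsc = neighborhood_similarity(adj)
    localization = {"A": 0.9, "B": 0.4, "C": 0.8, "D": 0.1}
    ortholog = {"A": 0.7, "B": 0.7, "C": 0.9, "D": 0.2}
    features = {p: (nsc[p], localization[p], ortholog[p]) for p in adj}
    ranking, counts = rank_by_dominance(features)
    print(ranking, counts)
```

Counting dominated proteins is one simple way to turn a partial order over several features into a ranking without choosing feature weights; whether NSO does this, or something different, cannot be determined from the abstract alone.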

Data availability

Data are publicly available.

Funding

This research is supported by the National Natural Science Foundation of China (Nos. 62141207, 62302107, 62366007, 61972185), the Guangxi Natural Science Foundation (No. 2022GXNSFAA035625), the Natural Science Foundation of Yunnan Province of China (No. 2019FA024), the Research Fund of Guangxi Key Lab of Multi-source Information Mining & Security (Nos. 20-A-01-03, 19-A-03-01), the Guangxi Normal University Science Research Project (Natural Science) (No. 2021JC008), the Guangxi Collaborative Innovation Center of Multi-source Information Integration and Intelligent Processing, and the Innovation Project of Guangxi Graduate Education (YCSW2023180).

Author information

Authors and Affiliations

Key Lab of Education Blockchain and Intelligent Technology, Ministry of Education, Guangxi Normal University, Guilin, 541004, China
Gaoshi Li, Xinlong Luo, Zhipeng Hu, Jingli Wu, Jiafei Liu & Xiaoshu Zhu
Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin, 541004, Guangxi, China
Gaoshi Li, Xinlong Luo, Zhipeng Hu, Jingli Wu, Jiafei Liu & Xiaoshu Zhu
College of Computer Science and Engineering, Guangxi Normal University, Guilin, 541004, Guangxi, China
Gaoshi Li, Xinlong Luo, Zhipeng Hu, Jingli Wu, Jiafei Liu & Xiaoshu Zhu
Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, 650500, Yunnan, China
Wei Peng
School of Computer and Information Security & School of Software Engineering, Guilin University of Electronic Science and Technology, Guilin, China
Xiaoshu Zhu

Corresponding authors

Correspondence to Jingli Wu or Wei Peng.

Ethics declarations

Conflict of interest

The authors declare no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Ethical approval

Not applicable.

Informed consent

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, G., Luo, X., Hu, Z. et al. Essential proteins discovery based on dominance relationship and neighborhood similarity centrality. Health Inf Sci Syst 11, 55 (2023). https://doi.org/10.1007/s13755-023-00252-9

Received: 23 May 2023
Accepted: 13 October 2023
Published: 16 November 2023
DOI: https://doi.org/10.1007/s13755-023-00252-9
