Abstract
Newly synthesized proteins in eukaryotic cells can function properly only after they are accurately transported to specific organelles. The establishment of protein databases and the development of prediction programs have accelerated the study of protein subcellular locations, but comparisons and evaluations of the prediction accuracy of these programs in plants are lacking. In this study, we built a random test set of maize proteins to evaluate the accuracy of six commonly used subcellular location prediction programs: iLoc-Plant, Plant-mPLoc, CELLO, WoLF PSORT, SherLoc2, and Predotar. Our results showed that prediction accuracy varied greatly depending on the program and the subcellular location involved. The programs using homology search methods (iLoc-Plant and Plant-mPLoc) performed better than those using feature search methods (CELLO, WoLF PSORT, SherLoc2, and Predotar). In particular, iLoc-Plant achieved an accuracy of 84.9 % for proteins whose subcellular locations have been experimentally determined and 74.3 % for all of the proteins in the test set. By location, prediction accuracy was highest for the nucleus, followed by the cytoplasm, mitochondria, plastids, endoplasmic reticulum, and vacuoles, and lowest for cell membrane, secreted, and multiple-location proteins. We discuss the accuracy of the six programs in this article. This study will assist plant biologists in choosing appropriate programs to predict the location of proteins and will provide clues regarding their function, especially for hypothetical or novel proteins.
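The abstract reports accuracy per program and per subcellular location but does not state how a prediction was scored. The following is a minimal sketch of one possible scoring scheme, assuming a prediction counts as correct only when the predicted set of locations exactly matches the annotated set, and that proteins with more than one annotated location are grouped as "multiple". The records, protein IDs, and function names are hypothetical placeholders, not data or code from the study.

```python
from collections import defaultdict

# Hypothetical records: (protein ID, annotated locations, predicted locations).
# IDs and labels are illustrative placeholders, not data from the study.
RECORDS = [
    ("P1", {"nucleus"}, {"nucleus"}),
    ("P2", {"cytoplasm"}, {"nucleus"}),
    ("P3", {"plastid"}, {"plastid"}),
    ("P4", {"nucleus", "cytoplasm"}, {"nucleus"}),
]


def is_correct(annotated, predicted):
    # Assumed criterion: the predicted set must equal the annotated set.
    return annotated == predicted


def accuracy(records):
    # Fraction of proteins whose prediction is scored as correct.
    hits = sum(is_correct(a, p) for _, a, p in records)
    return hits / len(records) if records else 0.0


def accuracy_by_location(records):
    # Group proteins by annotated location; proteins with several
    # annotated locations are pooled under the key "multiple".
    totals = defaultdict(int)
    hits = defaultdict(int)
    for _, annotated, predicted in records:
        key = "multiple" if len(annotated) > 1 else next(iter(annotated))
        totals[key] += 1
        hits[key] += is_correct(annotated, predicted)
    return {loc: hits[loc] / totals[loc] for loc in totals}


if __name__ == "__main__":
    print(accuracy(RECORDS))
    print(accuracy_by_location(RECORDS))
```

Running the same functions on the output of each program in turn would give a per-program, per-location comparison. A looser criterion (for example, counting any overlap between predicted and annotated locations as correct) would raise the scores for multiple-location proteins, so the choice of criterion should be stated alongside any reported accuracy.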
Acknowledgments
We acknowledge the National Natural Science Foundation of China (Grant No. 31371543), the Plan for Scientific Innovation Talent of Henan Province (Grant No. 144200510012), and the Program for Innovative Research Team (in Science and Technology) in University of Henan Province (Grant No. 15IRTSTHN015) for financial support.
Additional information
Erhui Xiong and Chenyu Zheng contributed equally to this work.
Cite this article
Xiong, E., Zheng, C., Wu, X. et al. Protein Subcellular Location: The Gap Between Prediction and Experimentation. Plant Mol Biol Rep 34, 52–61 (2016). https://doi.org/10.1007/s11105-015-0898-2
DOI: https://doi.org/10.1007/s11105-015-0898-2