Abstract
Expression of synthetic proteins from intergenic regions of E. coli and their functional association was recently demonstrated (Dhar et al. in J Biol Eng 3:2, 2009. doi:10.1186/1754-1611-3-2). This raised the question: if one can make ‘user-defined’ genes from the non-coding genome, how big is the artificially translatable genome? (Dinger et al. in PLoS Comput Biol 4, 2008; Frith et al. in RNA Biol 3(1):40–48, 2006a; Frith et al. in PLoS Genet 2(4):e52, 2006b). To answer this question, we performed a bioinformatics study of all reported E. coli intergenic sequences, in search of novel peptides and proteins not expressed in nature. Overall, 2500 E. coli intergenic sequences were computationally translated into ‘protein sequence equivalents’ and matched against all known proteins. Sequences that did not show any resemblance to known proteins were used to build a comprehensive profile in terms of their structure, function, localization, interactions, stability, and so on. A total of 362 protein sequences encoded by the intergenic sequences of the E. coli genome showed evidence of stable tertiary conformations. Experimental studies are underway to confirm some of the key predictions. This study points to a vast untapped repository of functional molecules lying undiscovered in the non-expressed genome of various organisms.
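The computational translation step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a plain six-frame translation with the standard codon assignments, and the function names and the `min_len` cutoff are hypothetical choices made only for this example. Matching the resulting peptides against known proteins is not shown.

```python
from itertools import product

# Standard codon assignments, listed in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; unknown codons become 'X', stops become '*'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Translate all three forward and three reverse-strand reading frames."""
    frames = {}
    for sign, strand in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{sign}{offset + 1}"] = translate(strand[offset:])
    return frames


def candidate_peptides(dna, min_len=20):
    """Collect stop-free stretches of at least min_len residues (hypothetical cutoff)."""
    peptides = []
    for protein in six_frame_translations(dna).values():
        peptides.extend(p for p in protein.split("*") if len(p) >= min_len)
    return peptides
```

Each candidate peptide would then be searched against a protein database, and only those without a match retained for structural and functional profiling.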
References
Cherian BS, Nair AS (2010) Protein location prediction using atomic composition and global features of the amino acid sequence. Biochem Biophys Res Commun 391:1670–1674
Dhar PK, Thwin CS, Tun K, Tsumoto Y, Maurer-Stroh S, Eisenhaber F, Surana U (2009) Synthesizing non-natural parts from natural genomic template. J Biol Eng 3:2. doi:10.1186/1754-1611-3-2
Dinger ME, Pang KC, Mercer TR, Mattick JS (2008) Differentiating protein-coding and noncoding RNA: challenges and ambiguities. PLoS Comput Biol 4(11):e1000176
Dosztanyi Z, Magyar C, Tusnady G, Simon I (2003) SCide: identification of stabilization centers in proteins. Bioinformatics 19:899–900
Frith MC, Bailey TL, Kasukawa T, Mignone F, Kummerfeld SK, Madera M, Sunkara S et al (2006a) Discrimination of non-protein-coding transcripts from protein-coding mRNA. RNA Biol 3(1):40–48
Frith MC, Forrest AR, Nourbakhsh E, Pang KC, Kai C, Kawai J, Carninci P et al (2006b) The abundance of short proteins in the mammalian proteome. PLoS Genet 2(4):e52
Gallivan JP, Dougherty DA (1999) Cation-pi interactions in structural biology. Proc Natl Acad Sci USA 96:9459–9464
Gasteiger E, Hoogland C, Gattiker A, Duvaud S, Wilkins MR, Appel RD, Bairoch A (2005) Protein identification and analysis tools on the ExPASy server. The Proteomics Protocols Handbook. Humana Press, New York, pp 571–607
Guex N, Peitsch MC (1997) SWISS-MODEL and the Swiss-Pdb viewer: an environment for comparative protein modeling. Electrophoresis 18:2714–2723
Harrison RS, Shepherd NE, Hoang HN, Ruiz-Gómez G, Hill TA, Driver RW, Desai VS et al (2010) Downsizing human, bacterial, and viral proteins to short water-stable alpha helices that maintain biological potency. Proc Natl Acad Sci USA 107(26):11686–11691
Jensen LJ, Gupta R, Blom N, Devos D, Tamames J, Kesmir C, Nielsen H, Stærfeldt HH, Rapacki K, Workman C, Andersen CAF, Knudsen S, Krogh A, Valencia A, Brunak S (2002) Ab initio prediction of human orphan protein function from post-translational modifications and localization features. J Mol Biol 319:1257–1265
Kageyama Y, Kondo T, Hashimoto Y (2011) Coding vs non-coding: translatability of short ORFs found in putative non-coding transcripts. Biochimie 93:1981–1986
Kondo T, Plaza S, Zanet J, Benrabah E, Valenti P, Hashimoto Y, Kobayashi S, Payre F, Kageyama Y (2010) Small peptides switch the transcriptional activity of Shavenbaby during Drosophila embryogenesis. Science 329(5989):336–339
Powers J-PS, Hancock REW (2003) The relationship between peptide structure and antibacterial activity. Peptides 24:1681–1691
Ramanathan K, Shanthi V, Rajasekaran R, Sudandiradoss C, Doss CGP, Sethumadhavan R (2011) Predicting therapeutic template by evaluating the structural stability of anti-cancer peptides: a computational approach. Int J Pept Res Ther 17(1):31–38
Tina KG, Bhadra R, Srinivasan N (2007) PIC: protein interactions calculator. Nucl Acids Res 35:W473–W476
Vriend G (1990) WHAT IF: a molecular modeling and drug design program. J Mol Graph 8:52–56
Yu CS, Chen YC, Lu CH, Hwang JK (2006) Prediction of protein subcellular localization. Proteins Struct Funct Bioinform 64:643–651
Zhang Y (2008) I-TASSER server for protein 3D structure prediction. BMC Bioinform 9:40
Acknowledgments
We sincerely thank the State Inter-University Centre of Excellence in Bioinformatics (SIUCEB), University of Kerala for the funding provided during this work.
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Thomas, V., Raj, N., Varughese, D. et al. Predicting stable functional peptides from the intergenic space of E. coli. Syst Synth Biol 9, 135–140 (2015). https://doi.org/10.1007/s11693-015-9172-z
DOI: https://doi.org/10.1007/s11693-015-9172-z