Abstract
Feed-forward neural networks trained by back-propagation are used to predict α-helical transmembrane segments of proteins. The networks are trained on the few membrane proteins whose transmembrane α-helix domains are known to atomic or nearly atomic resolution. When testing is performed with a jackknife procedure on the proteins of the training set, the fraction of total correct assignments is as high as 0.87, with an average length for the transmembrane segments of 20 residues. The method correctly fails to predict any transmembrane domain for porin, whose transmembrane segments are β-sheets. When the predictor is tested on globular proteins (26826 residues in total), lower and upper limits of 1.6% and 3.5% are determined for the fraction of mispredicted residues, indicating that the predictor is highly specific for α-helical domains of membrane proteins. The predictor is also tested on 37 membrane proteins whose transmembrane topology is partially known. The overall accuracy is 0.90, two percentage points higher than that obtained with statistical methods. The reliability of the prediction is 100% for 60% of the total 18242 predicted residues of membrane proteins. Our results show that the local directional information automatically extracted by the neural networks during the training phase plays a key role in determining the accuracy of the prediction.
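The jackknife testing mentioned above can be illustrated with a minimal sketch. It assumes a leave-one-protein-out scheme in which each protein is predicted by a model trained on all the others, and it scores the fraction of correctly assigned residues. The per-residue majority-vote model, the function names, and the toy sequences are hypothetical stand-ins chosen only to keep the example runnable; they are not the neural network or the data used in the paper.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List, Tuple

# A protein is (sequence, labels); labels use 'H' for transmembrane helix
# and '-' for everything else, one character per residue.
Protein = Tuple[str, str]
Model = Dict[str, str]


def train_majority(training_set: List[Protein]) -> Model:
    """Hypothetical stand-in model: most frequent label seen for each residue type."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for seq, labels in training_set:
        for residue, label in zip(seq, labels):
            counts[residue][label] += 1
    return {res: c.most_common(1)[0][0] for res, c in counts.items()}


def predict_majority(model: Model, seq: str) -> str:
    """Label each residue with its majority label; unseen residues default to '-'."""
    return "".join(model.get(res, "-") for res in seq)


def jackknife_accuracy(
    proteins: List[Protein],
    train: Callable[[List[Protein]], Model],
    predict: Callable[[Model, str], str],
) -> float:
    """Leave one protein out, train on the rest, and pool per-residue correctness."""
    correct = 0
    total = 0
    for i, (seq, labels) in enumerate(proteins):
        held_out_model = train(proteins[:i] + proteins[i + 1:])
        predicted = predict(held_out_model, seq)
        correct += sum(p == t for p, t in zip(predicted, labels))
        total += len(labels)
    return correct / total


# Toy data, invented for illustration only.
toy_proteins: List[Protein] = [
    ("LLLLKK", "HHHH--"),
    ("KKLLLL", "--HHHH"),
    ("LLKKLL", "HH--HH"),
    ("LKLK", "HHHH"),
]

accuracy = jackknife_accuracy(toy_proteins, train_majority, predict_majority)
print(f"fraction of correct assignments: {accuracy:.3f}")
```

Because the held-out protein never contributes to the model that predicts it, the pooled fraction estimates performance on unseen sequences, which is the role the jackknife plays for the small training set described in the abstract.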
Additional information
Correspondence to: R. Casadio
Cite this article
Casadio, R., Fariselli, P., Taroni, C. et al. A predictor of transmembrane α-helix domains of proteins based on neural networks. Eur Biophys J 24, 165–178 (1996). https://doi.org/10.1007/BF00180274
DOI: https://doi.org/10.1007/BF00180274