Abstract
Many structural and functional properties of proteins can be described as a one-dimensional, one-to-one mapping between the residues of a protein sequence and a target structure or function. These residue-level properties (RLPs) have frequently been predicted using neural networks and other machine learning algorithms. Here we present an algorithm that dynamically excludes from neural network training the examples that are most difficult to separate. The algorithm automatically filters out noisy statistical outliers and speeds up training without reducing the network's ability to generalize. Several methods of sampling data for neural network training were tried, and their impact on learning is analyzed.
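The abstract does not spell out the exclusion rule, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes a single sigmoid unit in place of a full network, ranks examples by absolute prediction error at the start of each epoch, and skips a fixed fraction of the worst ones. The names `train_with_exclusion` and `exclude_frac`, and the toy data, are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, bias, x):
    """Output of a single sigmoid unit for one feature vector."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_with_exclusion(X, y, epochs=200, lr=0.5, exclude_frac=0.125, seed=0):
    """SGD on one sigmoid unit, skipping the hardest examples each epoch.

    Assumption (not from the paper): "hardest" means largest absolute
    error under the current weights, and a fixed fraction is skipped.
    """
    rng = random.Random(seed)
    weights = [0.0] * len(X[0])
    bias = 0.0
    n_drop = int(exclude_frac * len(X))
    excluded = set()
    for _ in range(epochs):
        # Re-rank every epoch, so an example excluded now can return later.
        errors = [abs(predict(weights, bias, x) - t) for x, t in zip(X, y)]
        ranked = sorted(range(len(X)), key=lambda i: errors[i], reverse=True)
        excluded = set(ranked[:n_drop])
        active = ranked[n_drop:]
        rng.shuffle(active)
        for i in active:
            # Gradient of the cross-entropy loss w.r.t. the pre-activation.
            grad = predict(weights, bias, X[i]) - y[i]
            weights = [w - lr * grad * xi for w, xi in zip(weights, X[i])]
            bias -= lr * grad
    return weights, bias, excluded


# Toy data: 16 one-dimensional points, label 1 for x > 0, one label flipped.
xs = [i / 8 for i in range(-8, 9) if i != 0]
X = [[x] for x in xs]
y = [1 if x > 0 else 0 for x in xs]
y[1] = 1  # deliberately mislabeled example at x = -0.875

weights, bias, excluded = train_with_exclusion(X, y)
print(sorted(excluded))  # indices skipped in the final epoch
```

On data like this, the flipped example would be expected to end up in `excluded`, since its error stays high once the unit separates the remaining points, though the sketch does not guarantee it. Because the ranking is recomputed every epoch, no example is removed permanently.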
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ahmad, S. (2007). Dynamic Outlier Exclusion Training Algorithm for Sequence Based Predictions in Proteins Using Neural Network. In: Rajapakse, J.C., Schmidt, B., Volkert, G. (eds) Pattern Recognition in Bioinformatics. PRIB 2007. Lecture Notes in Computer Science, vol 4774. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75286-8_14
DOI: https://doi.org/10.1007/978-3-540-75286-8_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75285-1
Online ISBN: 978-3-540-75286-8
eBook Packages: Computer Science, Computer Science (R0)