
Frontiers in Bioscience-Landmark (FBL) is published by IMR Press from Volume 26 Issue 5 (2021). Previous articles were published by another publisher on a subscription basis, and they are hosted by IMR Press on imrpress.com as a courtesy and upon agreement with Frontiers in Bioscience.

Article
Computational identification and analysis of protein short linear motifs
1 UCD Complex and Adaptive Systems Laboratory, University College Dublin, Dublin, Ireland
2 UCD Conway Institute of Biomolecular and Biomedical Research, University College Dublin, Dublin, Ireland
3 UCD School of Medicine and Medical Sciences, University College Dublin, Dublin, Ireland
4 EMBL Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany
5 School of Biological Sciences, University of Southampton, Southampton, United Kingdom
Front. Biosci. (Landmark Ed) 2010, 15(3), 801–825; https://doi.org/10.2741/3647
Published: 1 June 2010
Abstract

Short linear motifs (SLiMs) in proteins can act as targets for proteolytic cleavage, sites of post-translational modification, determinants of sub-cellular localization, and mediators of protein-protein interactions. Computational discovery of SLiMs involves assembling a group of proteins postulated to share a potential motif, masking out residues less likely to contain such a motif, down-weighting shared motifs that arise through common evolutionary descent, and calculating statistical probabilities that allow for the multiple testing of all possible motifs. Much of the challenge for motif discovery lies in the assembly and masking of datasets of proteins likely to share motifs: because the motifs are typically short (between 3 and 10 amino acids in length), potential signals can easily be swamped by the noise of stochastically recurring motifs. Focusing on disordered regions of proteins, where SLiMs are predominantly found, and masking out non-conserved residues can reduce the level of noise, but more work is required to improve the quality of high-throughput experimental datasets (e.g. of physical protein interactions) as input for computational discovery.
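The discovery steps outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method or any published tool: the function names (`mask`, `support`, `binomial_tail`, `corrected_p`), the per-residue `keep` flags, and the Bonferroni-style correction are all assumptions chosen for illustration. The sketch masks residues judged unlikely to hold a motif, counts how many proteins in a dataset contain a candidate motif, and estimates how surprising that count is after allowing for the number of motifs tested. Down-weighting for common evolutionary descent is not implemented; one simple stand-in would be to pass in a single representative sequence per group of related proteins.

```python
import re
from math import comb


def mask(sequence, keep):
    """Replace residues whose keep flag is False with 'X' (hypothetical masking step)."""
    return "".join(res if flag else "X" for res, flag in zip(sequence, keep))


def support(pattern, sequences):
    """Count the sequences that contain at least one match to the regex pattern."""
    rx = re.compile(pattern)
    return sum(1 for seq in sequences if rx.search(seq))


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more proteins matching."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def corrected_p(k, n, p_occurrence, n_motifs_tested):
    """Bonferroni-style adjustment (an assumption) for testing many candidate motifs."""
    return min(1.0, binomial_tail(k, n, p_occurrence) * n_motifs_tested)


if __name__ == "__main__":
    # Toy sequences and an arbitrary candidate motif, purely illustrative.
    proteins = ["AAPTLPAA", "PPLP", "GGGG"]
    k = support("P.LP", proteins)
    # p_occurrence and n_motifs_tested are made-up values for the example.
    print(k, corrected_p(k, len(proteins), 0.05, 100))
```

In this sketch `p_occurrence` stands for the chance that a single protein matches the motif at random, which in practice would depend on sequence length, amino acid composition, and how much of the sequence was masked.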
