Prediction of Peroxisomal Targeting Signal 1 Containing Proteins from Amino Acid Sequence

https://doi.org/10.1016/S0022-2836(03)00319-X

Abstract

Peroxisomal matrix proteins have to be imported into their target organelle post-translationally. The major translocation pathway depends on a C-terminal targeting signal, termed PTS1. Our previous analysis of sequence variability in the PTS1 motif revealed that, in addition to the known C-terminal tripeptide, at least nine residues directly upstream are important for signal recognition in the PTS1–Pex5 receptor complex. The refined PTS1 motif description was implemented in a prediction tool composed of taxon-specific functions (metazoa, fungi, remaining taxa), capable of recognising potential PTS1s in query sequences. The composite score function consists of classical profile terms and additional terms penalising deviations from the derived physical property pattern over sequence segments. The prediction algorithm has been validated with a self-consistency test and three different cross-validation tests. Additionally, we tested the tool on a large set of non-peroxisomal negatives and on mutation data, and compared the prediction rate to the PTS1 component of the PSORT2 program. The sensitivity of our predictor in recognising documented PTS1 signal-containing proteins is close to 90% for reliable prediction. The predictor distinguishes even SKL-appended non-peroxisomally targeted proteins such as a mouse dihydrofolate reductase-SKL construct. The corresponding rate of false positives is not worse than 0.8%; thus, the tool can be applied for large-scale unsupervised sequence database annotation. A scan of public protein databases uncovered a number of as yet uncharacterised proteins for which the PTS1 signal might be critical for biological function. The predicted presence of a PTS1 signal implies peroxisomal localisation in the absence of N-terminal targeting sequences such as the mitochondrial import signal.

Introduction

Sorting of proteins to subcellular compartments of the eukaryotic cell is a complex process that involves a set of concurring pathways which are dependent on the presence of specific translocation signals. Protein targeting to the peroxisomal matrix relies on two known signals, PTS1 [1] and PTS2 [2, 3]. Additional mechanisms are probably involved in peroxisomal import [4, 5, 6]. Unlike N-terminal signal peptides, the PTS1 motif lies at the C terminus and triggers peroxisomal import post-translationally, when the substrate protein is already folded [7]. PTS1-containing proteins are recognised in the cytosol by their receptor molecule Pex5 [8, 9] before being translocated through the peroxisomal membrane [10].

The PTS1 motif was initially characterised as the C-terminal consensus tripeptide SKL [1, 11]. Further investigations pointed to variations in the accepted range of residues, to upstream sequence elements that modulate targeting efficiency, and to taxonomic differences in signal recognition [11–17].

Our previous analysis of sequence variability in the PTS1 motif [18] revealed that at least 12 C-terminal residues are involved in Pex5–PTS1 complex formation. Based on this motif description, we developed a prediction tool capable of quantifying the detected requirements in query sequences. The implemented score functions reflect differences in substrate specificity between groups of species represented in our training database.
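The motif description above (a C-terminal tripeptide plus at least nine residues directly upstream) suggests a natural first step for any scoring scheme: isolating the 12 C-terminal residues of a query sequence. The following Python sketch illustrates only that step. The function names, the padding character and the handling of short sequences are assumptions made for illustration and are not taken from the published tool.

```python
from typing import Tuple

MOTIF_LENGTH = 12      # 9 upstream residues + C-terminal tripeptide, as stated in the text
TRIPEPTIDE_LENGTH = 3


def c_terminal_window(sequence: str, length: int = MOTIF_LENGTH) -> str:
    """Return the last `length` residues of a protein sequence.

    Sequences shorter than `length` are left-padded with '-' (an assumption
    of this sketch; the source does not say how short queries are handled).
    """
    seq = sequence.strip().upper()
    return seq[-length:].rjust(length, "-")


def split_pts1_window(window: str) -> Tuple[str, str]:
    """Split a C-terminal window into (upstream segment, C-terminal tripeptide)."""
    return window[:-TRIPEPTIDE_LENGTH], window[-TRIPEPTIDE_LENGTH:]


# Hypothetical query sequence ending in the consensus tripeptide SKL.
window = c_terminal_window("MAAAAAAAAAGGGGGGGGGSKL")
upstream, tripeptide = split_pts1_window(window)
print(window, upstream, tripeptide)
```

A position-specific score function can then be evaluated over `window`, with the tripeptide and the upstream segment treated separately where needed.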

Outline of the prediction algorithm

The learning sets used for the parametrisation of the prediction program were those used in our previous characterisation of the PTS1 motif [18]. This heterogeneous database consisted of 150 oligopeptides tested for interaction with Pex5 (called “LH set”), and 205 sequences retrieved from publicly available databases (called “SW set”) [18]. The insight obtained in the analysis of features encoding the PTS1 signal was formalised with a score function S (equation (7) in the Methodological Details). In

Parametrisation of the score function. I. The profile score term $S_{\mathrm{profile}}$

To account for the different information content of the LH and SW sequence sets, the profile score is calculated by summing up three distinct terms, each calculated using a different profile matrix and adjusted by a weighting factor $\alpha_l$, in addition to a fourth term $S_{\mathrm{tri}}$ that penalises unusual combinations of residues within the C-terminal tripeptide (equation (1)):

$$S_{\mathrm{profile}} = S_{\mathrm{tri}} + \sum_{l=1}^{3} \alpha_l \, S_{\mathrm{profile}}^{l} \qquad (1)$$

The final profile matrices $S_{li}(a)$ for amino acid type $a$ at position $i$ in the alignment of learning set
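To make the structure of equation (1) concrete, here is a minimal Python sketch of a weighted sum of three profile terms plus a tripeptide term. It is not the published parametrisation: the toy profile matrices, the weights, the tripeptide term value, the zero score for residues missing from a profile column, and all function names are assumptions chosen for illustration. For brevity the example scores only a three-residue window.

```python
from typing import Dict, List, Sequence

# One profile = one {residue: score} mapping per motif position (hypothetical layout).
Profile = List[Dict[str, float]]


def profile_term(window: str, profile: Profile) -> float:
    """Sum the position-specific scores of `window` under one profile matrix.

    Residues absent from a column contribute 0.0 (an assumption of this sketch).
    """
    if len(window) != len(profile):
        raise ValueError("window and profile must cover the same positions")
    return sum(column.get(residue, 0.0) for residue, column in zip(window, profile))


def profile_score(window: str, profiles: Sequence[Profile],
                  alphas: Sequence[float], s_tri: float) -> float:
    """Evaluate S_profile = S_tri + sum_l alpha_l * S_profile^l for one window."""
    if len(profiles) != len(alphas):
        raise ValueError("one weight alpha_l is needed per profile")
    return s_tri + sum(a * profile_term(window, p) for a, p in zip(alphas, profiles))


# Toy values only; none of these numbers come from the paper.
profile_a: Profile = [{"S": 1.0}, {"K": 2.0}, {"L": 1.0}]
profile_b: Profile = [{"S": 0.5}, {"K": 0.5}, {"L": 0.5}]
profile_c: Profile = [{"A": 1.0}, {"R": 1.0}, {"L": 2.0}]
alphas = [1.0, 2.0, 0.5]
s_tri = -0.5  # precomputed tripeptide penalty, supplied by the caller in this sketch

score = profile_score("SKL", [profile_a, profile_b, profile_c], alphas, s_tri)
print(score)  # -0.5 + 1.0*4.0 + 2.0*1.5 + 0.5*2.0 = 7.5
```

Passing the tripeptide term in as a precomputed number keeps the sketch focused on the weighted sum; how that term is derived from residue combinations is not specified in the text shown here.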

Acknowledgements

The authors are grateful for continuous support from Boehringer Ingelheim. This project has been partly funded by the Fonds zur Förderung der wissenschaftlichen Forschung Österreichs (FWF grant P15037), by the Austrian National Bank (OeNB, Österreichische Nationalbank) and by the GENAU bioinformatics project (BMBWK Austria).

References (40)

  • K.A. Sacksteder et al. MCD encodes peroxisomal and cytoplasmic forms of malonyl-CoA decarboxylase and is mutated in malonyl-CoA decarboxylase deficiency. J. Biol. Chem. (1999)
  • T.J. Kotti et al. In mouse α-methylacyl-CoA racemase, the same gene product is simultaneously located in mitochondria and peroxisomes. J. Biol. Chem. (2000)
  • B. Eisenhaber et al. Prediction of potential GPI-modification sites in proprotein sequences. J. Mol. Biol. (1999)
  • S. Maurer-Stroh et al. N-terminal N-myristoylation of proteins: prediction of substrate proteins from amino acid sequence. J. Mol. Biol. (2002)
  • S.G. Gould et al. Identification of a peroxisomal targeting signal at the carboxy terminus of firefly luciferase. J. Cell Biol. (1987)
  • B.W. Swinkels et al. A novel, cleavable peroxisomal targeting signal at the amino-terminus of the rat 3-ketoacyl-CoA thiolase. EMBO J. (1991)
  • F. Kragler et al. Two independent targeting signals in catalase A of Saccharomyces cerevisiae. J. Cell Biol. (1993)
  • Y. Elgersma et al. Peroxisomal and mitochondrial carnitine acetyltransferases of Saccharomyces cerevisiae are encoded by a single gene. EMBO J. (1995)
  • P. Walton et al. Import of stably folded proteins into peroxisomes. Mol. Biol. Cell (1995)
  • S. Gould et al. Opinion: peroxisomal-protein import: is it really that complex? Nature Rev. Mol. Cell. Biol. (2002)