[6] Blocks database and its applications

https://doi.org/10.1016/S0076-6879(96)66008-X

Abstract

Protein blocks consist of multiply aligned sequence segments without gaps that represent the most highly conserved regions of protein families. A database of blocks has been constructed by successive application of the fully automated PROTOMAT system to lists of protein family members obtained from Prosite documentation. Currently, Blocks 8.0, based on protein families documented in Prosite 12, consists of 2884 blocks representing 770 families. Searches of the Blocks Database are carried out using protein or DNA sequence queries, and results are returned with measures of significance for both single and multiple block hits. The database has also proved useful for derivation of amino acid substitution matrices (the BLOSUM series) and other sets of parameters. WWW and E-mail servers provide access to the database and associated functions, including a block maker for sequences provided by the user.
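To make the idea of a block and its use in deriving substitution scores concrete, the following is a minimal Python sketch, not the authors' implementation. It treats a block as a list of equal-length, ungapped segments, counts residue pairs within each column, and converts the counts to log-odds scores. The toy block and the names `TOY_BLOCK`, `pair_counts`, and `log_odds` are hypothetical, and the sketch assumes a simplified procedure: it omits the clustering of similar sequences and the scaling and rounding applied in the published BLOSUM matrices.

```python
from collections import Counter
from itertools import combinations
from math import log2

# Hypothetical toy block: equal-length, ungapped segments, one per sequence.
TOY_BLOCK = ["ACDA", "ACDA", "ACEA", "GCEA"]


def pair_counts(block):
    """Count unordered residue pairs within each column of an ungapped block."""
    width = len(block[0])
    if any(len(segment) != width for segment in block):
        raise ValueError("block segments must have equal length (no gaps)")
    counts = Counter()
    for column in zip(*block):
        for a, b in combinations(column, 2):
            counts[tuple(sorted((a, b)))] += 1
    return counts


def log_odds(counts):
    """Convert pair counts to log2(observed / expected) scores per residue pair."""
    total = sum(counts.values())
    observed = {pair: n / total for pair, n in counts.items()}

    # Marginal residue frequencies: an identical pair contributes fully to its
    # residue; a mismatched pair contributes half to each of its two residues.
    marginal = Counter()
    for (a, b), freq in observed.items():
        if a == b:
            marginal[a] += freq
        else:
            marginal[a] += freq / 2
            marginal[b] += freq / 2

    scores = {}
    for (a, b), freq in observed.items():
        # Expected frequency of an unordered pair under independence.
        expected = marginal[a] * marginal[b] * (1 if a == b else 2)
        scores[(a, b)] = log2(freq / expected)
    return scores


if __name__ == "__main__":
    for pair, score in sorted(log_odds(pair_counts(TOY_BLOCK)).items()):
        print(pair, round(score, 2))
```

Only pairs that actually occur in the block receive a score here; a practical derivation would pool counts over many blocks so that every residue pair is observed.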
