FeatureMap3D—a tool to map protein features and sequence conservation onto homologous structures in the PDB

Wernersson, Rasmus; Rapacki, Kristoffer; Stærfeldt, Hans-Henrik; Sackett, Peter Wad; Mølgaard, Anne

doi:10.1093/nar/gkl227

Abstract

FeatureMap3D is a web-based tool that maps protein features onto 3D structures. The user provides sequences annotated with any feature of interest, such as post-translational modifications, protease cleavage sites or exonic structure, and FeatureMap3D then searches the Protein Data Bank (PDB) for structures of homologous proteins. The results are displayed both as an annotated sequence alignment, which shows the user-provided annotations as well as the sequence conservation between the query and the target sequence, and as a publication-quality image of the 3D protein structure with the selected features and sequence conservation highlighted. The results are also returned in a readily parsable text format and as a PyMol ( http://pymol.sourceforge.net/ ) script file, which allows the user to easily modify the protein structure image to suit a specific purpose. FeatureMap3D can also be used without sequence annotation, to evaluate the quality of the alignment of the input sequences to the most homologous structures in the PDB by means of the 3D structure visualization colored by sequence conservation. FeatureMap3D is available at: http://www.cbs.dtu.dk/services/FeatureMap3D/ .

INTRODUCTION

The 3D structure and flexibility of proteins determine their function in biological processes. The reaction mechanism and specificity of an enzyme are determined by the location of the active site residues relative to each other in the protein structure; post-translational modifications, such as glycosylation or phosphorylation, affect residues on the surface of proteins; and localization signals, such as Nuclear Export Signals (NES), are part of the 3D protein structure. Alternative splicing of genes can only result in functional proteins if the exonic structure is compatible with a foldable protein structure. When studying protein features, it is therefore relevant to investigate their localization in the biologically functioning form of the protein: the 3D protein structure ( 1 , 2 ).

The protein sequence databases are growing at a much faster rate than the Protein Data Bank (PDB) ( 3 ). However, with a few notable exceptions ( 4 , 5 ), it is generally accepted that two proteins sharing 50% or higher sequence identity have the same overall fold ( 6 ). Although there are examples of mutations that dramatically affect the structure of a protein ( 7 ), most point mutations outside the catalytic site have relatively small structural effects ( 8 ). Therefore, provided the sequence homology is sufficiently high, it is possible to transfer structural information from proteins in the PDB to their structurally uncharacterized homologues.

The FeatureMap3D server can be used in two ways. If the user simply needs to perform a BLAST ( 9 , 10 ) search of a sequence against the PDB, a protein sequence in FASTA format can be submitted to the FeatureMap3D server. If the search results in one or more hits, the PDB structure of the homologous protein(s) will be shown in a publication quality image, with the sequence conservation between the query sequence and the target protein structure mapped onto the structure in color. The alignment of the query and the target sequence is also given, along with the sequence numbering of the two sequences and the DSSP secondary structure annotation ( 11 ). If an active site is annotated in the PDB structure, it is automatically labeled in the sequence alignment and the active site residues are shown in the figure in stick representation. This functionality works independently of the user specified annotation mentioned below.

The FeatureMap3D server can also be used with pre-annotated sequences, to show directly the localization of protein features in the 3D structure of a homologous protein. The annotation can be provided in two ways: (i) using a descriptive format in a separate input field for annotating a sequence supplied in FASTA format (useful for single residue annotation), or (ii) using a TAB format file, which contains both sequence and annotation information directly (detailed description on the website).

The location of any such annotated feature will be displayed at the corresponding site in the structure of the hit, by highlighting the amino acid residue of the hit structure at that position. The server has a number of predefined graphical representations of annotation for both amino acid side-chain and backbone (see Table 1 for details). The hit structure therefore does not need to have N-glycosylation, or even an asparagine, at an annotated N-glycosylation site: the image simply shows where in the structure of the hit the glycosylated residue would be, based on the sequence alignment shown below the figure.
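The transfer of annotation from query to hit through the alignment can be illustrated with a minimal Python sketch. It is not the server's code: the function name, the use of '-' as the gap character and the sequential numbering of hit residues are assumptions made for illustration (real PDB entries may use non-sequential numbering).

```python
def map_annotation_to_hit(aligned_query, aligned_hit, annotation, hit_start=1):
    """Transfer per-residue annotation from a query to an aligned hit.

    aligned_query, aligned_hit: alignment rows of equal length, '-' = gap.
    annotation: one letter per ungapped query residue ('.' = no annotation).
    hit_start: number of the first hit residue (assumed sequential).
    Returns a dict {hit_residue_number: annotation_letter}.
    """
    if len(aligned_query) != len(aligned_hit):
        raise ValueError("alignment rows must have equal length")
    if len(annotation) != len(aligned_query.replace("-", "")):
        raise ValueError("annotation must cover every query residue")

    mapped = {}
    query_index = 0              # position in the ungapped query
    hit_number = hit_start - 1   # number of the last hit residue seen
    for q_res, h_res in zip(aligned_query, aligned_hit):
        if h_res != "-":
            hit_number += 1
        if q_res == "-":
            continue             # gap in the query: nothing to transfer
        letter = annotation[query_index]
        query_index += 1
        # Annotated residues facing a gap in the hit have no structural
        # position and are dropped.
        if letter != "." and h_res != "-":
            mapped[hit_number] = letter
    return mapped


# The query asparagine (annotated 'N') aligns with an aspartate in the hit;
# the annotated serine faces a gap and cannot be placed.
print(map_annotation_to_hit("MKN-TSA", "MKDQT-A", "..N.S."))
```

In this toy alignment the annotation lands on hit residue 3 even though the hit carries an aspartate there, which mirrors the point made above: the hit residue marks the position, not the modification itself.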

The structure is colored by sequence conservation, which makes it easy to see whether the feature of interest is located in a highly conserved part of the structure or in a region of poor sequence conservation. In the latter case, the local 3D structure of the query sequence is less likely to be well represented by the structure of the hit protein.
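As a rough illustration of how per-residue conservation classes can be derived from a pairwise alignment, the following Python sketch labels each hit residue as a match, a mismatch or a gap in the query. The class names and functions are assumptions for illustration. The server's own scheme, described in the Figure 1 caption, additionally separates mismatches of low and high significance; that distinction is not reproduced here because its criterion is not specified in the text.

```python
def classify_columns(aligned_query, aligned_hit):
    """Label every hit residue by how it aligns to the query.

    Columns where the hit has a gap are skipped, since they have no
    position in the hit structure that could be colored.
    """
    if len(aligned_query) != len(aligned_hit):
        raise ValueError("alignment rows must have equal length")
    classes = []
    for q_res, h_res in zip(aligned_query, aligned_hit):
        if h_res == "-":
            continue
        if q_res == "-":
            classes.append("gap_in_query")
        elif q_res == h_res:
            classes.append("match")
        else:
            classes.append("mismatch")
    return classes


def percent_identity(classes):
    """Share of hit residues that are identical to the query, in percent."""
    if not classes:
        return 0.0
    return 100.0 * classes.count("match") / len(classes)


classes = classify_columns("MKN-TSA", "MKDQT-A")
print(classes)
print(round(percent_identity(classes), 1))
```

A local window of such labels around an annotated residue gives a quick impression of whether the feature lies in a well-conserved region.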

Although other public domain tools and WWW servers are able to perform BLAST searches against the PDB ( 12 ) or to map annotated features onto 3D structures ( 13 ), this is to our knowledge the first server that combines these functions and provides an easy means of producing publication quality images of the results.

SOFTWARE FEATURES

Basic functionality

FeatureMap3D works according to the following workflow.

Step 1: searching

All input sequences are aligned to the PDB using blastp ( 10 ).
Hits against structures that are not based on X-ray or NMR are discarded.
Hits not fulfilling the user-adjustable criteria for significance are discarded.
The best hit is selected. Optimization criteria: best homology and best resolution in combination.
A report file (the ‘GetStruct report’) containing the mapping between the input data and the PDB match is created. This file also contains the optional sequence feature annotation of the input sequence and a number of calculated features from the PDB entry.
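The filtering and selection steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the server's implementation: the `Hit` record, the method labels, the E-value cut-off and the PDB identifiers are hypothetical, and because the text does not specify how homology and resolution are combined, the sketch simply ranks by sequence identity and uses resolution as a tie-breaker.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hit:
    pdb_id: str                  # placeholder identifiers in this sketch
    method: str                  # e.g. "X-RAY", "NMR", "MODEL" (assumed labels)
    evalue: float                # BLAST E-value
    identity: float              # percent sequence identity
    resolution: Optional[float]  # in angstroms; None if not applicable


def select_best_hit(hits: List[Hit], max_evalue: float = 1e-3) -> Optional[Hit]:
    """Discard unsuitable hits, then pick the best remaining one."""
    candidates = [
        h for h in hits
        if h.method in {"X-RAY", "NMR"}   # keep experimental structures only
        and h.evalue <= max_evalue        # user-adjustable significance
    ]
    if not candidates:
        return None

    def rank(h: Hit):
        # Assumed combination: highest identity first, then best (lowest)
        # resolution; entries without a resolution value sort last.
        res = h.resolution if h.resolution is not None else float("inf")
        return (-h.identity, res)

    return min(candidates, key=rank)


hits = [
    Hit("xxx1", "X-RAY", 1e-50, 90.0, 2.5),
    Hit("xxx2", "X-RAY", 1e-50, 90.0, 1.8),
    Hit("xxx3", "MODEL", 1e-80, 99.0, None),  # discarded: not X-ray or NMR
    Hit("xxx4", "NMR", 1e-1, 95.0, None),     # discarded: not significant
]
best = select_best_hit(hits)
print(best.pdb_id if best else "no hit")
```

Other combinations, such as a weighted score of identity and resolution, would fit the same description; the tie-breaker used here is only the simplest choice.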

Step 2: visualizing

Based on the ‘GetStruct report’ a PyMol script (*.pml) is autogenerated, which colors the matched PDB structure according to coverage and quality of the hit and the optional protein feature annotation supplied by the user. Using the generated PyMol script (which is available to download on its own) a publication quality ray-traced image is generated.
Also based on the ‘GetStruct report’ an annotated pairwise alignment is generated, containing both information about coverage and homology as well as protein feature annotation.
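To give an impression of what such a script might contain, the Python sketch below writes a small PyMol command file from a set of annotated hit residues. It is a simplified illustration, not the script the server produces: the function name, the object name `hit` and the base color are assumptions, only the side-chain letters of Table 1 are handled, and the table's 'stick' entry is written with PyMol's `sticks` representation keyword.

```python
# Side-chain annotation letters from Table 1: letter -> (color, representation)
STYLES = {
    "A": ("yellow", "sticks"),
    "N": ("red", "spheres"),
    "O": ("purple", "spheres"),
    "S": ("cyan", "spheres"),
    "T": ("slate", "spheres"),
    "Y": ("blue", "spheres"),
    "U": ("orange", "spheres"),
    "X": ("white", "sticks"),
}


def build_pml(pdb_file, chain, annotated):
    """Return the text of a PyMol script highlighting annotated residues.

    annotated: dict {hit_residue_number: annotation_letter}.
    Letters without an entry in STYLES (e.g. '.') are ignored.
    """
    lines = [
        f"load {pdb_file}, hit",
        "hide everything, hit",
        "show cartoon, hit",
        "color grey80, hit",      # neutral base color (assumption)
    ]
    for resi in sorted(annotated):
        style = STYLES.get(annotated[resi])
        if style is None:
            continue
        color, representation = style
        selection = f"hit and chain {chain} and resi {resi}"
        lines.append(f"show {representation}, {selection}")
        lines.append(f"color {color}, {selection}")
    lines.append("ray")           # ray-trace the current view
    return "\n".join(lines) + "\n"


script = build_pml("hit.pdb", "A", {3: "N", 10: "A"})
print(script)
```

Saving the returned text to a file with the .pml extension and opening it in PyMol would apply the commands in order; the view can then be adjusted interactively before ray-tracing again.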

Finally, the following files are offered both as separate downloads and in a single packed archive: GetStruct report, PDB file, ray-traced 3D image and the PyMol script. This makes it very easy for the user to download all the relevant data and manipulate the colored PDB structure in PyMol on the local computer. Local processing of the colored structure requires the PyMol program, available separately for download at http://pymol.sourceforge.net/ .

Integrating annotation of protein features

In addition to working on standard FASTA files, FeatureMap3D has the option of working on files containing annotation of protein features, for example annotated glycosylation or phosphorylation sites, or annotation of the underlying exon structure ( 14 ). Table 1 shows an overview of the annotation recognized by FeatureMap3D.

EXAMPLES OF USE

Mapping sequence homology and the underlying exon structure

Figure 1 illustrates two of the main features of FeatureMap3D, namely the color-coding of the quality of the sequence alignment ( Figure 1A ) and the color-coding of user defined sequence annotation ( Figure 1B ). The annotated pairwise alignment generated by the server is shown in Figure 1C.

The data shown in Figure 1 were generated as follows. A TAB file containing the DNA sequence and Intron/Exon structure of the Columba livia Alpha-A globin was constructed using the FeatureExtract software ( 15 ) on the GenBank entry AB001981. Next, the DNA file was translated using the Virtual Ribosome software described elsewhere in this issue ( 14 ), yielding a TAB file containing both protein sequence and annotation of the underlying exon structure. Finally, the peptide sequence was submitted to the FeatureMap3D server with ( Figure 1B and C ) and without ( Figure 1A ) the accompanying exon numbering annotation.

Mapping protein feature annotation onto a PDB structure

Rhamnogalacturonan acetylesterase is an example of a protein with known active site residues and two N-glycosylation sites ( 16 ). The protein sequence (in FASTA format) was annotated using the option of entering a small descriptive annotation in the manual annotation text-field on the FeatureMap3D server. The three active site residues were labeled with ‘A’, and the two N-glycosylation sites were labeled with ‘N’. After submitting the query to the FeatureMap3D server, the zipped archive containing all relevant files was downloaded. The image in Figure 2 was prepared from the extracted PyMol script file by rotating the molecule, selecting the ‘stereo’ option, zooming to get the right size of the molecule and clicking on the ‘Ray’ button in PyMol. Starting from the PyMol script file, the figure can be customized to show many other views.

Table 1

Types of annotation recognized by FeatureMap3D

Letter	Description	Color	Graphics
.	Null annotation	.	.
A	Active site	yellow	stick
N	N-glycosylation	red	spheres
O	O-glycosylation	purple	spheres
S	S-phosphorylation	cyan	spheres
T	T-phosphorylation	slate	spheres
Y	Y-phosphorylation	blue	spheres
U	Tyr-sulfation	orange	spheres
X	Generic PTM	white	stick
0	Custom backbone color	black	.
1	Custom backbone color	white/slate	.
2	Custom backbone color	red	.
3	Custom backbone color	cyan	.
4	Custom backbone color	purple	.
5	Custom backbone color	green	.
6	Custom backbone color	blue	.
7	Custom backbone color	yellow	.
8	Custom backbone color	orange	.
9	Custom backbone color	brown	.

In the Color and Graphics columns ‘.’ means ‘No effect’. The description of the type of annotation is only meant as a guideline, and the annotation letters can be freely used as a means of highlighting any kind of feature (e.g. disulfide bridges can be annotated with ‘X’ to mark the positions in white stick representation or ‘A’ to use yellow stick representation).


Figure 1

Visualizing the underlying exon structure of Columba livia Alpha-A globin. This figure illustrates the mapping of the protein sequence and underlying exon structure of Columba livia (domestic pigeon) Alpha-A globin onto the PDB structure 1A4F [ 17 ], which contains the Alpha/Beta-globin dimer from Anser indicus (bar-headed goose). ( A ) Shows the coloring scheme when the exon structure annotation is not used. Here the coloring reflects the alignment between the query protein sequence and the PDB protein sequence. Color key: green, perfect match; brown, mismatch (low significance); violet, mismatch (high significance); blue, sequence gap in query sequence; light gray, unmatched chain(s), in this case the Beta-globin chain. ( B ) Shows the coloring when the exon structure is taken into account. Color key: slate-blue, exon 1; red, exon 2; cyan, exon 3; light gray, unmatched chain(s). ( C ) Shows the pairwise alignment generated by the FeatureMap3D server (here it has been color-coded to highlight both the underlying exon structure and the sequence homology). The topmost sequence is the query sequence (Alpha-D), the sequence below is the PDB hit sequence (1A4F, chain a). The line above the query sequence holds the (optional) annotation information (here the numbers 1, 2 and 3, indicating the exon number). The line below the PDB sequence contains DSSP secondary structure information (inferred from the PDB entry).

Figure 2

( A ) Stereo view of the 3D structure of Aspergillus aculeatus rhamnogalacturonan acetylesterase ( 16 ) showing the localization in the structure of the annotated residues. The catalytic triad residues are shown in yellow stick representation, and the glycan structure bound at the two N-glycosylation sites is shown as red spheres. ( B ) The pairwise sequence alignment generated by the FeatureMap3D server. The catalytic residues are shown in yellow in the annotation line above the sequences and the two N-glycosylation sites are indicated in red.

Dikeos Mario Soumpasis, Ramneek Gupta and Søren Brunak are thanked for helpful discussions. This work is supported by a grant from The Danish National Research Foundation and The Danish Research Agency. Funding to pay the Open Access publication charges for this article was provided by The Danish Research Agency.

Conflict of interest statement . None declared.

REFERENCES

1. Cour, T., Kiemer, L., Mølgaard, A., Gupta, R., Skriver, K., Brunak, S. (2004) Analysis and prediction of leucine-rich nuclear export signals. Protein Eng. Des. Sel., 17, 527–536.
2. Julenius, K., Mølgaard, A., Gupta, R., Brunak, S. (2005) Prediction, conservation analysis and structural characterization of mammalian mucin-type O-glycosylation sites. Glycobiology, 15, 153–164.
3. Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N., Bourne, P.E. (2000) The Protein Data Bank. Nucleic Acids Res., 28, 235–242.
4. Dalal, S., Balasubramanian, S., Regan, L. (1997) Protein alchemy: changing beta-sheet into alpha-helix. Nature Struct. Biol., 4, 548–552.
5. Riesner, D. (2003) Biochemistry and structure of PrP(C) and PrP(Sc). Br. Med. Bull., 66, 21–33.
6. Koehl, P. and Levitt, M. (2002) Sequence variations within protein families are linearly related to structural variations. J. Mol. Biol., 323, 551–562.
7. Glykos, N.M., Cesareni, G., Kokkinidis, M. (1999) Protein plasticity to the extreme: changing the topology of a 4-alpha-helical bundle with a single amino acid substitution. Structure, 7, 597–603.
8. Sinha, N. and Nussinov, R. (2001) Point mutations and sequence variability in proteins: redistributions of preexisting populations. Proc. Natl Acad. Sci. USA, 98, 3139–3144.
9. Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J. (1990) Basic local alignment search tool. J. Mol. Biol., 215, 403–410.
10. Altschul, S.F., Madden, T.L., Schaeffer, A.A., Zhang, J., Zhang, Z., Miller, W., Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25, 3389–3402.
11. Kabsch, W. and Sander, C. (1983) Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers, 22, 2577–2637.
12. Li, W., Jaroszewski, L., Godzik, A. (2000) Sequence clustering strategies improve remote homology recognitions while reducing search times. Protein Eng., 15, 643–649.
13. Prlic, A., Down, T.A., Hubbard, T.J.P. (2005) Adding some SPICE to DAS. Bioinformatics, 21, ii40–ii41.
14. Wernersson, R. (2006) Virtual Ribosome – a comprehensive DNA translation tool with support for integration of sequence feature annotation. Nucleic Acids Res., 34, W385–W388.
15. Wernersson, R. (2005) FeatureExtract – extraction of sequence annotation made easy. Nucleic Acids Res., 33, W567–W569.
16. Mølgaard, A., Kauppinen, S., Larsen, S. (2000) Rhamnogalacturonan acetylesterase elucidates the structure and function of a new family of hydrolases. Structure, 8, 373–383.
17. Zhang, J., Hua, Z., Tame, J.R., Lu, G., Zhang, R., Gu, X. (1996) The crystal structure of a high oxygen affinity species of haemoglobin (bar-headed goose haemoglobin in the oxy form). J. Mol. Biol., 255, 484–493.

© The Author 2006. Published by Oxford University Press. All rights reserved. The online version of this article has been published under an open access model. Users are entitled to use, reproduce, disseminate, or display the open access version of this article for non-commercial purposes provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but only in part or as a derivative work this must be clearly indicated. For commercial re-use, please contact journals.permissions@oxfordjournals.org
