The impact of splicing on protein domain architecture
Introduction — domain architectures and splicing
Protein domains are structural, functional and evolutionary building blocks that, within one protein, can form various architectures composed of one or several domains [1]. Domains can be defined from a sequence similarity viewpoint, as in the Pfam database [2], from an evolutionary perspective, as in SCOP [3], or from a structural perspective, as in CATH [4]. In many cases these definitions overlap [5].
Early in the genomic era studies showed that multidomain proteins are
Alternative splicing in the human proteome
In the early days of genomics, many different dedicated alternative splicing databases were produced. However, to the best of our knowledge, hardly any of these have been consistently updated during the last few years, so today the best resources for studying alternative splicing are the more general databases: firstly, Ensembl [29], a database that contains eukaryotic genomes; secondly, Vega/Havana [30, 31•], a resource for vertebrate genome annotation; thirdly, Unigene [32], a transcriptome
Identification of functional isoforms
It came as a surprise for many when, in 2007, Tress and co-authors [25••] first showed that alternative splicing is even more common than previously thought. Further, the results indicated that for many of the alternative protein products, there is strong evidence suggesting that they encode nonfunctional proteins. Perhaps most strikingly, the authors suggested that it is unlikely that the ‘spectrum of conventional enzymatic or structural functions can be substantially extended through
How does alternative splicing affect the protein domain architecture?
Given the current consensus that only a small fraction of all alternatively spliced products result in functional proteins, it is crucial to select the biologically relevant isoforms before analysing different splice forms. Several methods to limit the datasets have been explored. One approach is to use only splice forms that are conserved between species, for instance between mouse and human.
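The conservation filter described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method of any cited study: the records, the gene name `GENE_A`, the domain labels and the splice form identifiers are all hypothetical, and the identifiers are assumed to be directly comparable between orthologs, which in practice requires an exon-level alignment step that is not shown.

```python
from collections import defaultdict

# Hypothetical toy records: (species, gene, splice_form_id, domain architecture).
# A domain architecture is modelled as an ordered tuple of domain labels.
# splice_form_id is ASSUMED to be comparable across orthologous genes.
ISOFORMS = [
    ("human", "GENE_A", "canonical", ("Kinase", "SH2")),
    ("human", "GENE_A", "skip_exon_4", ("Kinase",)),
    ("human", "GENE_A", "retain_intron_2", ("SH2",)),
    ("mouse", "GENE_A", "canonical", ("Kinase", "SH2")),
    ("mouse", "GENE_A", "skip_exon_4", ("Kinase",)),
]


def conserved_splice_forms(isoforms, species_a="human", species_b="mouse"):
    """Return {gene: set of splice form ids observed in both species}."""
    seen = defaultdict(lambda: defaultdict(set))
    for species, gene, form_id, _arch in isoforms:
        seen[gene][species].add(form_id)
    return {
        gene: by_species[species_a] & by_species[species_b]
        for gene, by_species in seen.items()
    }


def architecture_changes(isoforms, species="human", reference_id="canonical"):
    """List conserved isoforms whose architecture differs from the reference.

    Only splice forms conserved between human and mouse are considered;
    genes without a conserved reference isoform are skipped.
    """
    conserved = conserved_splice_forms(isoforms)
    by_gene = defaultdict(dict)
    for sp, gene, form_id, arch in isoforms:
        if sp == species and form_id in conserved.get(gene, set()):
            by_gene[gene][form_id] = arch

    changes = {}
    for gene, forms in by_gene.items():
        reference = forms.get(reference_id)
        if reference is None:
            continue
        changes[gene] = {
            form_id: arch for form_id, arch in forms.items() if arch != reference
        }
    return changes


if __name__ == "__main__":
    print(conserved_splice_forms(ISOFORMS))
    print(architecture_changes(ISOFORMS))
```

In this toy data the intron-retention form is seen only in human and is therefore dropped, while the exon-skipping form is kept and flagged as changing the domain architecture. Comparing architectures as ordered tuples is a simplification; it ignores partial domain truncations, which would need domain boundary coordinates.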
In one of the first large scale studies of domains
Splicing and domain architecture for functional variation
There are some well-studied examples where alternative splicing affects domain structure and clearly yields a domain architectural and/or phenotypic effect. Some of the best established examples of isoforms with domain architectural changes are associated with cancer, for instance the epidermal growth factor receptor (EGFR), a transmembrane protein that belongs to the protein kinase family (Figure 3). This protein is, in various isoforms, overexpressed in many cancers [57]. The longest
Concluding remarks and future outlook
The main challenge for accurate assessment of the importance of alternative splicing for domain architectural changes is improved identification of functional isoforms at the protein level. As stated above, two main approaches have been used to attempt this: use of evolutionarily conserved patterns or direct studies of the protein isoforms. Assuming that the recent observations of rapidly evolving changes in isoforms between species are correct [63•, 64•], many isoforms
References and recommended reading
Papers of particular interest, published within the period of review, have been highlighted as:
• of special interest
•• of outstanding interest
Acknowledgements
This work was supported by grants from the Swedish Research Council (VR-NT 2009-5072 and VR-M 2010-3555), SSF, the Foundation for Strategic Research, Science for Life Laboratory; the EU 7th Framework through the EDICT project, contract no: FP7-HEALTH-F4-2007-201924. Funding for SL was provided by BILS, Bioinformatics Infrastructure for Life Science.
References (67)
- et al. Scop: a structural classification of proteins database for the investigation of sequences and structures. J Mol Biol (1995)
- et al. Domain combinations in archaeal, eubacterial and eukaryotic proteomes. J Mol Biol (2001)
- et al. Multi-domain proteins in the three kingdoms of life: orphan domains and other unassigned regions. J Mol Biol (2005)
- et al. Quantification of the elevated rate of domain rearrangements in metazoa. J Mol Biol (2007)
- et al. On the antiquity of introns. Cell (1986)
- et al. Function of alternative splicing. Gene (2013)
- et al. Verification of alternative splicing variants based on domain integrity, truncation length and intrinsic protein disorder. Nucleic Acids Res (2011)
- et al. Intrinsic disorder in cell-signaling and cancer-associated proteins. J Mol Biol (2002)
- et al. The evolutionary landscape of alternative splicing in vertebrate species. Science (2012)
- et al. Mapping intact protein isoforms in discovery mode using top-down proteomics. Nature (2011)
- MAISTAS: a tool for automatic structural evaluation of alternative splicing products. Bioinformatics
- Chemical and biological evolution of a nucleotide-binding protein. Nature
- Pfam: a comprehensive database of protein domain families based on seed alignments. Proteins: Struct Funct Genet
- Cath: a hierarchical classification of protein domain structures. Structure
- A comparison of sequence and structure protein domain families as a basis for structural genomics. Bioinformatics
- Expansion of protein domain repeats. PLoS Comp Biol
- Significant expansion of exon-bordering protein domains during animal proteome evolution. Nucleic Acids Res
- Reassessing domain architecture evolution of metazoan proteins: major impact of gene prediction errors. Genes
- Reassessing domain architecture evolution of metazoan proteins: major impact of errors caused by confusing paralogs and epaktologs. Genes
- A survey on intron and exon lengths. Nucleic Acids Res
- Origin and evolution of spliceosomal introns. Biol Direct
- Why genes in pieces? Nature
- Initial sequencing and analysis of the human genome. Nature
- Alternative splicing and genome complexity. Nat Genet
- Different levels of alternative splicing among eukaryotes. Nucleic Acids Res
- Expansion of the eukaryotic proteome by alternative splicing. Nature
- Gencode: producing a reference annotation for encode. Genome Biol
- Alternative splicing: current perspectives. Bioessays
- Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nat Genet
- The implications of alternative splicing in the ENCODE protein complement. Proc Natl Acad Sci U S A
- A global view of gene activity and alternative splicing by deep sequencing of the human transcriptome. Science
- Identifiability of isoform deconvolution from junction arrays and RNA-seq. Bioinformatics
- Proteomics studies confirm the presence of alternative protein isoforms on a large scale. Genome Biol