Mapping the chemical and sequence space of the ShKT superfamily
Introduction
The Stichodactyla helianthus K channel toxin-like superfamily (ShKT) consists of molecules derived from both venomous and non-venomous species (Chang et al., 2018). These molecules are involved in a range of functions and biological processes, from modulation of potassium channels to roles in morphogenesis and cell differentiation (Gibbs et al., 2008; Prentis et al., 2018; Tsang et al., 2007; Tudor et al., 1996; Yan et al., 2000). ShKT domains are widespread throughout nature, occurring in sea anemones, reptiles, nematodes and mammals (Castañeda et al., 1995; Galea et al., 2014; Gibbs et al., 2006; Minagawa et al., 1998; Nguyen et al., 2013; Sunagar et al., 2012; Tsang et al., 2007). However, little is known about their evolution or the relationship between their chemical properties and function.
The ShKT superfamily contains two main families, the ShK-like proteins and the cysteine-rich domains (CRD) of cysteine-rich secretory proteins (CRISPs), which share a common fold (Fig. 1A). ShK-like domains occur as discrete single domains, as multiple repeats or in combination with peptidase and other enzyme domains (Castañeda et al., 1995; Finn et al., 2014; Galea et al., 2014) (Fig. S1). The CRISPs have a more constrained organisation, with a ShKT domain linked by a flexible hinge to the pathogenesis-related 1 (PR1) domain (Gibbs et al., 2008). Despite their structural similarity, superfamily members differ widely in their documented functions, biological processes and taxonomic distribution.
The prototypical superfamily member is the sea anemone toxin ShK, a 35-residue peptide isolated from the sea anemone Stichodactyla helianthus (Castañeda et al., 1995). The structure of ShK is characterised by two α-helices (Fig. 1B) and three disulfide bonds (Cys1-Cys6, Cys2-Cys4 and Cys3-Cys5) (Tudor et al., 1996). CRISP sequences are larger polypeptides, characterised by 10–16 cysteines and molecular masses of 20–30 kDa, with distinct domains. The CRD region has a similar fold and cysteine connectivity to that of ShK (Fig. 1C) and is therefore thought to be homologous to ShK-like peptides, although their sequence identity is low (Guo et al., 2005).
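The cysteine connectivity described above can be made concrete by mapping ordinal cysteine labels (Cys1 to Cys6) onto residue numbers. The following is a minimal Python sketch, not part of the original analysis. The ShK sequence string is quoted from memory of the published mature peptide and should be checked against a curated database entry before use; the function name `disulfide_pairs` is hypothetical.

```python
def disulfide_pairs(sequence, connectivity):
    """Translate ordinal cysteine pairs (1-based) into residue-number pairs (1-based)."""
    # Residue numbers of every cysteine, in order of appearance.
    cys_positions = [i + 1 for i, aa in enumerate(sequence) if aa == "C"]
    return [(cys_positions[a - 1], cys_positions[b - 1]) for a, b in connectivity]


# Assumption: mature ShK sequence as recalled from the literature (35 residues).
SHK = "RSCIDTIPKSRCTAFQCKHSMKYRLSFCRKTCGTC"

# Connectivity stated in the text: Cys1-Cys6, Cys2-Cys4, Cys3-Cys5.
SHKT_CONNECTIVITY = [(1, 6), (2, 4), (3, 5)]

pairs = disulfide_pairs(SHK, SHKT_CONNECTIVITY)
for first, second in pairs:
    print(f"Cys{first}-Cys{second}")
```

Under the sequence assumption above, the first and last cysteines form one bond, which pins the N- and C-terminal regions of the peptide together.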
Four main functions and biological processes have been attributed thus far to proteins in the ShKT superfamily: potassium channel modulation, calcium channel modulation, involvement in morphogenesis pathways and cell regeneration. ShK, the prototypical peptide of the ShKT domain, is a potent KV1 channel blocker (Castañeda et al., 1995). An analogue of this peptide is in clinical trials for treatment of the autoimmune disease psoriasis (Chandy and Norton, 2017; Chi et al., 2012; Tarcha et al., 2017). Other ShK-like single-domain peptides show activity across various KV channels, for example BgK (from Bunodosoma granulifera) (KV1.1, 1.2, 1.3) and OspTx2a (from Oulactis sp.) (KV1.2, 1.6) (Cotton et al., 1997; Sunanda et al., 2018). ShK-like sequences can also be polyfunctional, with ShPI-1 (from Stichodactyla helianthus) and APEKTx1 (from Anthopleura elegantissima) each combining Kunitz-type inhibitor activity with KV channel binding activity (García-Fernández et al., 2016; Peigneur et al., 2011). The precursor domain of NEP3 (from Nematostella vectensis), a ShKT domain repeat protein, was shown to be neurotoxic to fish (Columbus-Shenkar et al., 2018; Moran et al., 2013). The functions of the remaining ShKT domains in NEP3 are yet to be determined (Columbus-Shenkar et al., 2018).
The toxin-like domain (TxD) of the matrix metalloprotease MMP-23 has high structural similarity to ShK/BgK and the CRD region of CRISPs (Galea et al., 2014; Nguyen et al., 2013; Rangaraju et al., 2010). Functionally, MMP-23 may play a role in potassium channel trafficking to cell membranes (Galea et al., 2014). The TxD domain blocks several KV channels (KV1.1, 1.3, 1.4, 1.6 and 3.2), although it is ineffective against others (KV1.2, 1.5, 1.7 and KCa3.1) (Galea et al., 2014).
The importance of the ShKT domain in morphogenesis pathways is exemplified by Mab-7, a protein from the roundworm Caenorhabditis elegans (Tsang et al., 2007). The ShKT domain in Mab-7 is essential for protein binding, the protein itself being responsible for sensory ray morphogenesis, which is required for contact between the male and hermaphrodite during mating (Tsang et al., 2007).
CRISPs are found predominantly in vertebrates and fall into four main functional groups. CRISP-1 is located in the reproductive tracts of mammals, and CRISP-2 acts as an autoantigen in sperm. Functional studies of the mouse CRD domain of Tpx-1, a CRISP-2 member, show that it regulates cardiac ryanodine receptor Ca2+ signalling, in line with its relatedness to the ion-channel modulators BgK and ShK (Gibbs et al., 2006). CRISP-3 is found in the venom of reptiles (snakes and lizards), and CRISP-4 is rodent-specific, where it is implicated in sperm interactions (Gibbs et al., 2008; Turunen et al., 2011). Several homologous CRISP-like sequences have been identified in insects (e.g. mosquitoes, beetles and ants), although most of these proteins remain uncharacterised (Holt et al., 2002; Oxley et al., 2014; The Tribolium Genome Sequencing Consortium, 2008).
Cysteine-rich peptide sequences from nature comprise a valuable library of bioactive compounds. Their compact, stable structures permit a large variety of loop sequences to be tolerated, leading to high sequence and function diversity. However, this diversity typically precludes most traditional sequence analyses, such as phylogenetics (Inkpen and Doolittle, 2016; Pearson and Sierk, 2005; Rost, 1999). We therefore built quantitative maps based on sequence/chemical space to define the functional regions explored by the extant ShKs. This technique has been applied successfully to investigate the evolution of two superfamilies of defensins, sequences that are also cysteine-rich and highly sequence-diverse (Mitchell et al., 2019; Shafee et al., 2017; Shafee and Anderson, 2018), as well as for traditional globular proteins (Jackson et al., 2018). Sequences can be explicitly placed within a multidimensional sequence space based on a multiple sequence alignment (MSA) combined with the biophysical properties of their residues (Atchley et al., 2005). This multidimensional space can be summarised in a smaller number of dimensions that capture the important sequence properties. Clusters within such a space indicate sets of co-varying peptide chemical properties (Shafee and Anderson, 2018).
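The idea of placing aligned sequences in a property-based space and then summarising that space in fewer dimensions can be sketched as follows. This is an illustrative Python example only, not the authors' pipeline. The toy alignment is hypothetical, the two per-residue properties (Kyte-Doolittle hydropathy and a simple formal charge) are stand-ins for whatever property set is actually used, gaps are crudely encoded as zero, and principal component analysis via singular value decomposition is assumed as the dimension-reduction step.

```python
import numpy as np

# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
# Simplified formal charge at neutral pH (assumption: His treated as neutral).
CHARGE = {"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0}


def encode(alignment):
    """Turn each aligned sequence into a vector with two properties per column."""
    rows = []
    for seq in alignment:
        row = []
        for aa in seq:
            row.append(KD.get(aa, 0.0))      # gaps and unknown residues -> 0.0
            row.append(CHARGE.get(aa, 0.0))
        rows.append(row)
    return np.array(rows, dtype=float)


def pca(matrix, n_components=2):
    """Project rows onto the leading principal components using SVD."""
    centred = matrix - matrix.mean(axis=0)
    u, s, _ = np.linalg.svd(centred, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]


# Hypothetical toy alignment; all rows must have the same length.
toy_alignment = [
    "RSCIDT-PKSRC",
    "KSCKDTAPKSQC",
    "ESCEDL-PESLC",
    "DACEELAPDSLC",
]

scores, explained = pca(encode(toy_alignment))
print(scores.round(2))
print(explained.round(2))
```

Each row of `scores` is one sequence's coordinate in the reduced space; sequences with similar property profiles at the same alignment columns land close together, which is what makes clusters interpretable as sets of co-varying chemical properties.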
Here, we explore the biophysical determinants of these clusters and their relationship to function. This gives insights into the main groups of ShK-like sequences, and their evolution, while clearly delineating the CRISP CRD family within the ShKT superfamily. This analysis can also be used to guide exploration of the available ShKT sequences by identifying sequences from poorly characterised regions of chemical space.
Section snippets
Sequence data
Sequences deposited in the protein family database (Pfam; https://pfam.xfam.org/) under the ShKT clan (CL0213), CRISP (PF08562) and ShK-like (PF01549) were downloaded (Finn et al., 2014). Full-length sequences were examined, and the predicted mature peptides were retained for the alignments and chemical space analysis. The dataset also included two functionally characterised ShK-like sequences isolated from the tentacle transcriptomes of the sea anemone Oulactis sp., U-AITX-Oulsp1
Sequence retrieval and data preparation
A total of 1082 sequences, including single and multiple domains, were subjected to analysis after division into their ShK and CRD domain components. The analysis included 50 new sequences from the speckled anemone (Oulactis sp.) transcriptome containing 90 ShK-like domains. Many of these sequences contained multiple ShK-like repeat domains, up to nine repeats in one sequence, or were in multi-domain proteins associated with peptidases (e.g. astacin) or protease inhibitors (e.g. Kunitz-type) (Fig.
Discussion
The ShKT fold is an ancient scaffold found in cnidarians (∼700 mya), nematodes, mammals and toxicoferan-reptiles (∼65 mya) in the form of ShK-like proteins and the CRD region of CRISPs (Chhabra et al., 2014; Fautin, 1998; Gibbs et al., 2008, 2006; Sunagar et al., 2012; Tudor et al., 1996). Within this constrained disulfide structure, sequence divergence is exceptionally high. However, the sequences are clearly not random; otherwise, the sequence space would show a spherical cloud. The clusters
Ethical statement
On behalf of, and having obtained permission from, all the authors, I declare that:
- (a) the material has not been published in whole or in part elsewhere;
- (b) the paper is not currently being considered for publication elsewhere;
- (c) all authors have been personally and actively involved in work leading to the review. All authors have read the manuscript and agree to its publication in Toxicon.
Conflict of interest
The authors declare there are no conflicts of interest.
Acknowledgment
The authors acknowledge Dr. Rodrigo A. V. Morales and Edward Airey for their contribution to the preparation of sequences used in the analysis. This project was funded in part by ARC linkage grant LP150100621. M.L.M. acknowledges an Australian Government Research Training Program Scholarship, Monash Medicinal Chemistry Faculty Scholarship and Monash University-Museum Victoria Scholarship top-up. R.S.N. acknowledges fellowship support from the Australian National Health and Medical Research Council
References (64)
- et al. Characterization of a potassium channel toxin from the Caribbean sea anemone Stichodactyla helianthus. Toxicon (1995)
- et al. Peptide blockers of KV1.3 channels in T cells as therapeutics for autoimmune disease. Curr. Opin. Chem. Biol. (2017)
- et al. Development of a sea anemone toxin as an immunomodulator for therapy of autoimmune diseases. Toxicon (2012)
- et al. The cysteine-rich secretory protein domain of Tpx-1 is related to ion channel toxins and regulates ryanodine receptor Ca2+ signaling. J. Biol. Chem. (2006)
- et al. Crystal structure of the cysteine-rich secretory protein stecrisp reveals that the cysteine-rich domain has a K+ channel inhibitor-like fold. J. Biol. Chem. (2005)
- et al. ShK-Dap22, a potent KV1.3-specific immunosuppressive polypeptide. J. Biol. Chem. (1998)
- et al. Structure, folding and stability of a minimal homologue from Anemonia sulcata of the sea anemone potassium channel blocker ShK. Peptides (2018)
- et al. Synthesis, folding, structure and activity of a predicted peptide from the sea anemone Oulactis sp. with an ShKT fold. Toxicon (2018)
- et al. A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. (1982)
- et al. Primary structure of a potassium channel toxin from the sea anemone Actinia equina. (1998)
- Intracellular trafficking of the KV1.3 potassium channel is regulated by the prodomain of a matrix metalloprotease. J. Biol. Chem.
- Aurelin, a novel antimicrobial peptide from jellyfish Aurelia aurita with structural features of defensins and channel-blocking toxins. Biochem. Biophys. Res. Commun.
- The genome of the clonal raider ant Cerapachys biroi. Curr. Biol.
- The limits of protein sequence comparison? Curr. Opin. Struct. Biol.
- A bifunctional sea anemone peptide with Kunitz type protease and potassium channel inhibiting properties. Biochem. Pharmacol.
- Potassium channel modulation by a toxin domain in matrix metalloprotease 23. J. Biol. Chem.
- Crystal structure of a CRISP family Ca2+ channel blocker derived from snake venom. J. Mol. Biol.
- mab-7 encodes a novel transmembrane protein that orchestrates sensory ray morphogenesis in C. elegans. Dev. Biol.
- Solving the protein sequence metric problem. Proc. Natl. Acad. Sci. Unit. States Am.
- The protein data bank. Nucleic Acids Res.
- TOP-IDP-Scale: a new amino acid scale measuring propensity for intrinsic disorder. Protein Pept. Lett.
- ShK toxin: history, structure and therapeutic applications for autoimmune diseases. WikiJ. Sci.
- KV1.3 channel-blocking immunomodulatory peptides from parasitic worms: implications for autoimmune diseases. FASEB J.
- Dynamics of venom composition across a complex life cycle. Elife
- A potassium-channel toxin from the sea anemone Bunodosoma granulifera, an inhibitor for KV1 channels - revision of the amino acid sequence, disulfide-bridge assignment, chemical synthesis, and biological activity. Eur. J. Biochem.
- WebLogo: a sequence logo generator. Genome Res.
- The igraph software package for complex network research. InterJournal Complex Syst.
- A centipede toxin family defines an ancient class of CSαβ defensins. Structure
- Pymol: an open-source molecular graphics tool. CCP4 Newsl. Protein Crystallogr.
- R: A Language and Environment for Statistical Computing
- Cnidaria
- Pfam: the protein families database. Nucleic Acids Res.