Differential Evolution and Neofunctionalization of Snake Venom Metalloprotease Domains

Snake venom metalloproteases (SVMP) are composed of five domains: signal peptide, propeptide, metalloprotease, disintegrin, and cysteine-rich. Secreted toxins are typically combinatorial variations of the latter three domains. The SVMP-encoding genes of Psammophis mossambicus venom are unique in containing only the signal and propeptide domains. We show that the Psammophis SVMP propeptide evolves rapidly and is subject to a high degree of positive selection. Unlike Psammophis, some species of Echis express both the typical multidomain and the unusual monodomain (propeptide only) SVMP, with the result that the latter accumulates a lower level of variation. We show that most mutations in the multidomain Echis SVMP occurred in the protease domain responsible for proteolytic and hemorrhagic activities. The cysteine-rich and disintegrin-like domains, which are putatively responsible for making the P-III SVMPs more potent than the P-I and P-II forms, accumulate the remaining variation. Thus, the binding sites on the molecule's surface are evolving rapidly whereas the core remains relatively conserved. Bioassays conducted on two post-translationally cleaved novel proline-rich peptides from the P. mossambicus propeptide domain showed them to have been neofunctionalized for specific inhibition of mammalian α7 neuronal nicotinic acetylcholine receptors. We show that the proline-rich, postsynaptic-specific neurotoxic peptides from Azemiops feae are the result of convergent evolution within the precursor region of the C-type natriuretic peptide instead of the SVMP. The results of this study reinforce the value of studying obscure venoms for biodiscovery of novel investigational ligands.

Snake venom metalloproteases (SVMP) evolved from ADAM (a disintegrin and metalloprotease) proteins that were recruited into the venom of snakes near the base of the advanced snake (Caenophidia) radiation. They have been identified in the venoms of all lineages of advanced snakes (1-5). The ancestral SVMP (P-III) contains (in downstream order) five domains: signal + propeptide + metalloprotease + disintegrin + cysteine-rich. Following the divergence of vipers from the remaining caenophidians, extensive gene duplication, domain loss, and positive selection resulted in generation of the P-I and P-II classes of SVMP within the viperid lineage (6-8). These two derived classes found in viper venoms lack either the cysteine-rich domain (P-II) or the cysteine-rich and disintegrin domains (P-I) (8,9). The signal peptide and the propeptide domain are typically cleaved off before expression, although the latter has been detected in venoms on occasion (9). SVMP are often the dominant venom component in the venom of viperid snakes, but are typically much less significant in the venom of other snake families (10-15). The majority of SVMP principally exhibit hemorrhagic activity, although other functions, such as the activation of prothrombin and Factor X, fibrin(ogen)olysis, apoptosis, and the inhibition of platelet aggregation, have also been reported (16). Although SVMP-induced hemorrhage is primarily dependent on the proteolytic activity of the metalloprotease domain, the potency of this activity is increased by the presence of the additional domain structures that are absent from the P-I and P-II classes (17). Consequently, P-III SVMP typically exhibit the greatest hemorrhagic activities. SVMP represent a model system for investigating the evolutionary processes responsible for generating new protein functions.
Extensive gene duplication and domain loss have resulted in the generation of a large multilocus gene family that encodes related proteins exhibiting divergent molecular structures (7,18). Additional diversity has arisen because of accelerated evolution within new SVMP classes following the loss of domains (7,19). Complicating matters is evidence that some SVMP genes are also capable of selectively expressing specific domains (2,7,20); for example, some genes encode "short-coding disintegrins," which consist solely of a signal peptide and a disintegrin domain (20). There have also been two reports of P-III SVMP that encode only the propeptide domain of the SVMP gene. Truncated SVMPs that terminate at the end of the prodomain have been transcriptomically identified from the viper genus Echis (7,11), although the proteins encoded by these genes have yet to be detected proteomically in venom. Notably, Echis venoms contain high levels of SVMP, with representatives of all three SVMP classes present. This prevalence of SVMP in the venom of these snakes is thought to be largely responsible for inducing the severe hemorrhaging observed in envenomated prey (11,12,21). SVMP found in the venom of the lamprophiid snake Psammophis mossambicus, on the other hand, consist entirely of selectively expressed propeptide domains, which have evolved via deletion of the metalloprotease, disintegrin, and cysteine-rich domains from the ancestral multidomain P-III SVMP gene (2). Despite this fascinating observation, the evolution and bioactivity of these atypical SVMP remain completely unexplored. Mass spectrometry of P. mossambicus venom revealed an abundance and diversity of peptides with molecular weights consistent with peptides proteolytically liberated post-translationally from the propeptide precursor region (22).
Here we investigate the evolution of pro-domain SVMP genes in Psammophis and compare the rate of positive selection acting on these genes to propeptide and P-III SVMP genes isolated from Echis venom. We reveal that Psammophis SVMP have accumulated significantly higher numbers of positively selected sites in the propeptide domain than observed in Echis. We also demonstrate that positive selection pressures acting on the truncated structure of Psammophis SVMPs are directly responsible for driving protein neofunctionalization in the form of novel neurotoxic activity. We used molecular phylogenetics to determine if monodomain propeptide expression in Echis shared an evolutionary history with Psammophis, and is thus ancestral to all advanced snakes, or if these were convergent derivations. In addition, it was recently shown that a unique type of proline-rich peptide isolated from the venom of the viperid snake Azemiops feae is neurotoxic like the P. mossambicus Pm1 and Pm2 peptides. The A. feae peptide differs in activity by blocking the neuromuscular nicotinic acetylcholine receptor rather than the neuronal receptor targeted by the peptides in this study (22). Before this study, the molecular evolutionary history of the A. feae peptide remained to be elucidated, as it was known only from the 21-residue post-translationally processed form secreted in crude venom. Thus, we sequenced the full coding region for this neurotoxin from the mRNA of A. feae venom glands to determine whether the proline-rich neurotoxins from Psammophis and Azemiops shared a molecular evolutionary history or if they were convergently derived.

MATERIALS AND METHODS
Sequence Retrieval and Alignment-Psammophis mossambicus propeptide monodomain, Echis spp. propeptide monodomain (E. coloratus and E. pyramidum leakeyi) and E. coloratus multidomain SVMP nucleotide sequences were recovered bioinformatically from the National Center for Biotechnology Information (NCBI: http://www.ncbi.nlm.nih.gov/). The translated nucleotide sequences were aligned using PRANK (24) and then adjusted manually to optimize the alignments. To avoid confusion, previously obtained sequences are given with their UniProt accession numbers whereas ones obtained in this study are given with their GenBank accession numbers.
cDNA Library Construction-Venom glands of an Azemiops feae specimen from Hunan, China were dissected under surgical anesthesia 3 days after stimulation by milking. Total RNA was extracted using the standard TRIzol Plus method (Invitrogen, Carlsbad, CA). Extracts were enriched for mRNA using an RNeasy mRNA mini kit (Qiagen, Valencia, CA). mRNA was reverse transcribed, fragmented, and ligated to a unique 10-base multiplex identifier tag prepared using standard protocols and applied to one PicoTitrePlate (PTP) for simultaneous amplification and sequencing on a Roche 454 GS FLX+ Titanium platform (Australian Genome Research Facility). Automated grouping and analysis of sample-specific multiplex identifier reads informatically separated sequences from the other transcriptomes on the plates, which were then post-processed to remove low-quality sequences before de novo assembly into contiguous sequences (contigs) using MIRA software (ref.). Assembled contigs were processed using CLC Main Workbench (CLC-Bio) and the Blast2GO bioinformatic suite (47-50) to provide Gene Ontology, BLAST, and domain and InterPro annotation. The above analyses assisted in rationalization of the large numbers of assembled contigs into phylogenetic "groups" for detailed phylogenetic analyses outlined below.
Test for Recombination-Recombination can mislead phylogenetic and evolutionary selection interpretations (25). Hence, we evaluated the effect of recombination on Psammophis mossambicus and Echis spp. SVMPs by employing the single breakpoint recombination method implemented in the HyPhy package (26-28). Potential breakpoints were detected using the small-sample Akaike information criterion (AICc), and the sequences were compartmentalized before conducting the selection analyses.
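As an illustration of how a small-sample AIC comparison works in general, the following minimal sketch computes the corrected criterion AICc = 2k - 2 lnL + 2k(k + 1)/(n - k - 1) for two competing models. It does not reproduce HyPhy's internal implementation; the function name, log-likelihoods, parameter counts, and sample size are hypothetical values chosen only for illustration.

```python
def aicc(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Small-sample corrected Akaike information criterion (AICc)."""
    if n_obs - n_params - 1 <= 0:
        raise ValueError("AICc is undefined unless n_obs > n_params + 1")
    aic = 2 * n_params - 2 * log_likelihood
    correction = (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)
    return aic + correction


# Hypothetical numbers: a model without a breakpoint versus a model with one
# breakpoint, which fits better but needs more parameters.
no_break = aicc(log_likelihood=-1250.0, n_params=10, n_obs=300)
one_break = aicc(log_likelihood=-1240.0, n_params=20, n_obs=300)

# The model with the lower AICc is preferred.
best = "one breakpoint" if one_break < no_break else "no breakpoint"
print(f"AICc without breakpoint: {no_break:.3f}")
print(f"AICc with one breakpoint: {one_break:.3f}")
print(f"Preferred model: {best}")
```

In this made-up case the gain in log-likelihood does not offset the penalty for the extra parameters, so the breakpoint would not be supported.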
Phylogenetic Reconstruction-The best-fit model of nucleotide substitution for each data set was determined using jModelTest (29), according to AIC. Model-averaged parameter estimates of the gamma shape parameter (α) and the proportion of invariant sites (pinvar) were used for phylogenetic reconstruction. Phylogenetic relationships were determined using Bayesian and maximum-likelihood approaches. MrBayes version 3.1 (30,31) was used for Bayesian inference. Tree searches were run using four Markov chains for 10 million generations, sampling every 100th tree. The log-likelihood score of each saved tree was plotted against the number of generations to establish the point at which the log-likelihood scores of the analyses reached their asymptote. Twenty-five percent of the total trees sampled were discarded as burn-in. The posterior probabilities for clades were established by constructing a majority-rule consensus tree for all trees generated after completion of the burn-in. The analyses were repeated three times to ensure that the trees generated were not clustered around local optima. An optimal maximum-likelihood phylogenetic tree was obtained using PhyML 3.0 (32), and node support was evaluated with 1000 bootstrapping replicates.
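The sampling scheme described above implies the following tree counts per run. This is a small arithmetic sketch using only the figures stated in the text; it ignores the starting tree that some programs also record, and the variable names are illustrative.

```python
generations = 10_000_000  # length of each MCMC run (from the text)
sample_every = 100        # sampling interval (from the text)
burn_in_fraction = 0.25   # fraction of samples discarded (from the text)

sampled_trees = generations // sample_every
burn_in_trees = int(sampled_trees * burn_in_fraction)
retained_trees = sampled_trees - burn_in_trees

print(f"Trees sampled per run: {sampled_trees}")
print(f"Trees discarded as burn-in: {burn_in_trees}")
print(f"Trees used for the consensus: {retained_trees}")
```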
Selection Analyses-We employed likelihood models of coding-sequence evolution (33,34) as implemented in CODEML of the PAML (35) package to estimate the selection pressures shaping the E. coloratus multidomain, Echis spp. and P. mossambicus monodomain SVMPs. We first employed the lineage-specific one-ratio model that assumes a single ω (the nonsynonymous-to-synonymous substitution rate ratio) for the entire phylogenetic tree. The one-ratio model is very conservative and can only detect positive selection if the ω ratio averaged over all the sites along the lineage is significantly greater than one.
The assumption of constant evolutionary selection pressure for the entire phylogenetic tree over millions of years is unrealistic. Thus, lineage-specific models like the one-ratio model fail to identify regions in proteins that might accumulate variation more often than others, and hence they can underestimate the strength of selection. We therefore employed site-specific models that estimate positive selection statistically as a nonsynonymous-to-synonymous nucleotide-substitution rate ratio (ω) significantly greater than 1. We compared likelihood values for three pairs of models with different assumed ω distributions, as no a priori expectation exists for the same: M0 (constant ω across all sites) versus M3 (allows ω to vary across sites within "n" discrete categories, n ≥ 3); M1a (a model of neutral evolution), in which all sites are assumed to be either under negative (ω < 1) or neutral selection (ω = 1), versus M2a (a model of positive selection), which in addition to the site classes mentioned for M1a assumes a third category of sites with ω > 1 (positive selection); and, finally, M7 (β) versus M8 (β and ω), models that mirror the evolutionary constraints of M1 and M2 but assume that ω values are drawn from a β distribution (36). Only if the alternative models (M3, M2a, and M8, which allow sites with ω > 1) show a better fit in the likelihood ratio test relative to their null models (M0, M1a, and M7, which do not allow sites with ω > 1) are their results considered significant. The likelihood ratio test statistic is estimated as twice the difference in maximum log-likelihood values between nested models and is compared with the χ² distribution with the appropriate degrees of freedom (the difference in the number of parameters between the two models). The Bayes empirical Bayes approach (37) was used to identify amino acids under positive selection by calculating the posterior probabilities that a particular amino acid belongs to a given selection class (neutral, conserved, or highly variable). Sites with a high posterior probability (PP ≥ 95%) of belonging to the "ω > 1 class" were inferred to be positively selected.
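The likelihood ratio test described above can be sketched as follows. This is a minimal illustration, not the CODEML implementation: the log-likelihood values are hypothetical, and the degrees of freedom used here (2 for M1a versus M2a and for M7 versus M8, 4 for M0 versus M3 with three site classes) are assumptions based on the usual parameter counts of these models rather than values stated in the text. The χ² tail probability uses the closed form that holds for even degrees of freedom, so no external library is needed.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with even df.

    For df = 2k the survival function has the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def likelihood_ratio_test(lnl_null: float, lnl_alt: float, df: int):
    """Return the LRT statistic 2 * (lnL_alt - lnL_null) and its p-value."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf_even_df(stat, df)


# Hypothetical log-likelihoods for a null (e.g. M7) and alternative (e.g. M8).
stat, p_value = likelihood_ratio_test(lnl_null=-2050.0, lnl_alt=-2040.0, df=2)
print(f"2*delta(lnL) = {stat:.2f}, p = {p_value:.2e}")
```

With these made-up values the alternative model would be preferred, and only then would the sites it assigns to the ω > 1 class be examined further.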
We employed Single Likelihood Ancestor Counting, Fixed-Effects Likelihood, and Random Effects Likelihood models (38) implemented in HyPhy (39) to provide further support for the aforementioned analyses and to detect sites evolving under the influence of positive and negative selection. The more advanced Mixed Effects Model Evolution (40) was also used to detect episodic diversifying selection. Mixed Effects Model Evolution employs Fixed-Effects Likelihood along the sites and Random Effects Likelihood across the branches to detect episodic diversifying selection. For clear depiction of the proportion of sites under selection, an evolutionary fingerprint analysis was carried out using the evolutionary selection distance (ESD) algorithm (41) implemented in Datamonkey.
The direct comparison of omega values computed using the aforementioned methods can be misleading as different proportions of sites may be under selection. Hence, we partitioned the Echis coloratus SVMP domains and computed omega values simultaneously using Mgene (4) and option G test (42) from Codeml to assess the selection pressures on various SVMP domains.
Structural Analyses-To depict the differential selection pressures on various domains of E. coloratus SVMP, we constructed a homology model using the Phyre2 webserver (43) and mapped the sites under positive selection using PyMOL (44). The crystal structure of 2E3X:A was selected as the best-fit template for the target sequence GU012165.1 for homology modeling. The program GETAREA (45) was used to calculate the accessible surface area (ASA) (i.e., solvent exposure) of amino acid side chains. It uses the atom coordinates of the PDB file and indicates if a residue is buried or exposed to the surrounding medium by comparing the ratio between side-chain ASA and the "random coil" values per residue. An amino acid is considered to be buried if it has an ASA ratio of less than 20% and exposed if the ASA ratio is ≥ 50%. The ConSurf webserver was used for mapping the evolutionary selection pressures on the three-dimensional homology model of E. coloratus SVMPs (46).
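The exposure thresholds stated above can be written as a simple classification rule. This sketch only encodes the 20% and 50% cut-offs given in the text; the function name, the "unassigned" label for intermediate values, and the example ratios are illustrative assumptions and are not part of GETAREA.

```python
def classify_exposure(asa_ratio_percent: float) -> str:
    """Classify a residue from its side-chain ASA ratio (in percent)."""
    if asa_ratio_percent < 20.0:
        return "buried"
    if asa_ratio_percent >= 50.0:
        return "exposed"
    # Ratios between the two cut-offs fall into neither class.
    return "unassigned"


# Hypothetical ASA ratios for three residues.
for ratio in (8.5, 35.0, 72.3):
    print(f"ASA ratio {ratio:5.1f}% -> {classify_exposure(ratio)}")
```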
P. mossambicus propeptide SVMPs Pm1 and Pm2 were synthesized on a Protein Technology (Symphony) automated peptide synthesizer using Fmoc-Arg(Pbf)-Wang resin (0.1 mmol). Assembly of the peptides was performed using 2-(1H-benzotriazol-1-yl)-1,1,3,3-tetramethyluronium hexafluorophosphate/diisopropylethylamine in situ activation protocols (Schnolzer et al., 2007) to couple the Fmoc-protected amino acid to the resin (5 equiv. excess, coupling time 20 min). Fmoc deprotection was performed with 30% piperidine/dimethylformamide for 1 min followed by a 2 min repeat. Washes were performed 10 times after each coupling as well as after each deprotection step. After chain assembly and final Fmoc deprotection, the peptide resins were washed with methanol and dichloromethane and dried in a stream of nitrogen. Cleavage of peptide from the resin was performed at room temperature in TFA:H2O:triisopropylsilane:ethanedithiol (87.5:5:5:2.5) for 3 h. Cold diethyl ether (30 ml) was then added to the filtered cleavage mixture and the peptide precipitated. The precipitate was collected by centrifugation and subsequently washed with further cold diethyl ether to remove scavengers. The final product was dissolved in 50% acetonitrile and lyophilized to yield a white solid product. The crude, reduced peptide was examined by reversed-phase HPLC for purity, and the correct molecular weight was confirmed by electrospray mass spectrometry.
Analytical HPLC runs were performed using a Shimadzu HPLC system LC10A with a dual-wavelength UV detector set at 214 nm and 254 nm. A reversed-phase C18 column (Zorbax 300-SB C-18; 4.6 × 50 mm) with a flow rate of 2 ml/min was used. Elution was performed using a 0-80% gradient of Buffer B (0.043% TFA in 90% acetonitrile) in Buffer A (0.05% TFA in water) over 20 min. Crude peptides were purified by semipreparative HPLC on a Shimadzu HPLC system LC8A with a reversed-phase C18 column (Vydac C-18, 25 cm × 10 mm) running at a flow rate of 5 ml/min with a 1%/min gradient of 5-50% Buffer B. The purity of the final product was evaluated by analytical HPLC (Zorbax 300SB C18: 4.6 × 100 mm) with a flow rate of 1 ml/min and a 1.67%/min gradient of Buffer B (5-45%). The final purity of all synthesized peptides was >95%. Electrospray mass spectra were collected inline during analytical HPLC runs on an Applied Biosystems API-150 spectrometer operating in the positive ion mode with an OR of 20, Rng of 220, and Turbospray of 350 degrees. Masses between 300 and 2200 amu were detected (Step 0.2 amu, Dwell 0.3 ms).
Raw fluorescence readings were converted to response over baseline using the FLIPR Tetra software ScreenWorks 3.1.1.4 (Molecular Devices) and were expressed relative to the maximum increase in fluorescence of control responses.
Male chicks (4-10 days) were killed by CO2 and exsanguination. Both chick biventer cervicis nerve-muscle preparations were isolated and mounted on wire tissue holders under 1 g resting tension in 5 ml organ baths containing Krebs solution (NaCl, 118.4 mM; KCl, 4.7 mM; MgSO4, 1.2 mM; KH2PO4, 1.2 mM; CaCl2, 2.5 mM; NaHCO3, 25 mM; and glucose, 11.1 mM), maintained at 34°C and bubbled with 95% O2/5% CO2. Indirect twitches were evoked by electrical stimulation of the motor nerve (supramaximal voltage, 0.2 ms, 0.1 Hz) using a Grass S88 stimulator (Grass Instruments, Quincy, MA). d-Tubocurarine (10 µM) was added, and subsequent abolition of twitches confirmed selective stimulation of the motor nerve, after which thorough washing with Krebs solution was applied to re-establish twitches. In the absence of electrical stimulation, contractile responses to acetylcholine (ACh; 1 mM for 30 s), carbachol (CCh; 20 µM for 60 s), and potassium (KCl; 40 mM for 30 s) were obtained before the addition of peptide and at the conclusion of the experiment. The preparation was equilibrated for 30 min before the addition of peptide. Peptides were left in contact with the preparation for a maximum of 3 h to test for slow-developing effects.
RESULTS
Molecular Evolution-Phylogenetic analyses showed that the prepro-only expression in Psammophis and Echis are convergent derivations (Fig. 1). Sequence alignment showed extreme variation in the Psammophis monodomain form whereas the Echis monodomain prepro form was almost identical to the prepro region expressed as part of the multidomain gene (Fig. 2).
Selection Analyses-By using the one-ratio model, the simplest of the lineage-specific models that computes a single ω value for all branches in the phylogeny, the global ω was estimated to be 0.94, 0.39, and 0.39 for the Psammophis monodomain, Echis monodomain, and Echis coloratus multidomain SVMP genes, respectively (supplemental Table S2). Because this value is an average over all codons in each lineage, it suggests a rapid accumulation of mutations in the Psammophis SVMP propeptide domain. In contrast, the Echis propeptide domains seem to be under negative selection. The global ω estimates for the Echis SVMP protease, disintegrin-like, and cysteine-rich domains were 1.33, 0.92, and 0.97, respectively (supplemental Table S2). This highlights the strong influence of positive selection on the protease domain that is responsible for hemorrhagic activity.
The Bayes empirical Bayes approach implemented in site-model 8 estimated that about 8.5% of sites were under positive selection in the P. mossambicus monodomain propeptide region (ω = 1.23) whereas the Echis monodomain (ω = 0.52) and E. coloratus multidomain (ω = 0.46) SVMPs evolve under negative selection (Fig. 3; Table I). Omega estimates for the protease, disintegrin-like, and cysteine-rich domains of E. coloratus SVMP under this approach were 1.60 (32% of sites), 1.33 (25% of sites), and 1.39 (21% of sites), respectively, highlighting the strong influence of positive selection on the evolution of these domains (Table II and Fig. 4).
Direct comparison of ω values computed using the aforementioned algorithms could potentially be misleading, as these genes may have a different proportion of sites under selection. Hence, for a better comparison of selection pressures acting on different domains of the E. coloratus multidomain SVMP, we employed Mgene with option G analysis of Codeml. This test simultaneously estimated ω of 0.38, 1.28, 0.91, and 1.04 for the propeptide, protease, disintegrin-like, and cysteine-rich domains, respectively (Table II). Moreover, the site-specific model 8 analyses using the Bayes empirical Bayes approach identified 28 (~53% of positively selected sites), 9 (17%), and 15 (~29%) amino acid sites under positive selection in the protease, disintegrin-like, and cysteine-rich domains, respectively, confirming the greater influence of positive selection on the protease domain than on any other domain (Table II and Fig. 4). In contrast to the Echis protease domain and the Psammophis monodomain propeptide region, the E. coloratus multidomain and the Echis spp. monodomain SVMP propeptide regions have evolved under negative selection (ω = 0.38 and ω = 0.52, respectively). We mapped positively selected sites onto the homology model of E. coloratus multidomain SVMP to clearly depict their location in the toxin (Fig. 4). Our analyses show that almost 60% of these positively selected sites are confined to the molecular surface, whereas only 12% are buried in the conserved core of the toxin.
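The domain shares quoted above follow directly from the site counts. The short sketch below recomputes them from the three counts given in the text (28, 9, and 15 sites); the variable names are illustrative.

```python
# Positively selected sites per domain, as reported in the text.
sites = {"protease": 28, "disintegrin-like": 9, "cysteine-rich": 15}

total = sum(sites.values())
shares = {domain: 100.0 * count / total for domain, count in sites.items()}

print(f"Total positively selected sites: {total}")
for domain, share in shares.items():
    print(f"{domain}: {sites[domain]} sites ({share:.1f}%)")
```

The protease share works out to just under 54%, which is why it can be rounded to either 53% or 54% depending on convention.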
Bioactivity Testing-Assessment of the pharmacological activity of two Psammophis mossambicus proline-rich SVMP propeptide-only domain variants, Pm1 (VYNLHGSVPAPPWQPHARRPRPKNR, encoded by UniProt A7X4A6) and Pm2 (VYNLHGSVPAPPWQPHARRPRPKYR, encoded by UniProt A7X461), using high-throughput FLIPR assays revealed that these peptides did not exhibit any agonist-like activity at concentrations up to 1 mM, and were inactive at L-type and N-type Cav, Nav, and α3β2/4 nAChR (data not shown). However, both peptides caused concentration-dependent inhibition of endogenously expressed human α7 nAChR, with IC50 values of 11.99 µM and 11.83 µM (pIC50 4.921 ± 0.058 and 4.927 ± 0.094), respectively (Fig. 5). Psammophis mossambicus SVMP propeptide-only domain variants Pm1 and Pm2 had no significant inhibitory effect on twitch height in the chick biventer cervicis nerve-muscle preparation (n = 4, data not shown) at concentrations up to 10 µM over a period of 3 h.
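The reported pIC50 and IC50 values are related by IC50 = 10^(-pIC50) mol/l. As a quick consistency check, the sketch below converts the two pIC50 values given in the text into micromolar concentrations; the function name is illustrative.

```python
def pic50_to_ic50_micromolar(pic50: float) -> float:
    """Convert pIC50 (negative log10 of the molar IC50) to IC50 in µM."""
    return 10.0 ** (-pic50) * 1e6


# pIC50 values for Pm1 and Pm2 as reported in the text.
for name, pic50 in (("Pm1", 4.921), ("Pm2", 4.927)):
    print(f"{name}: pIC50 {pic50} -> IC50 {pic50_to_ic50_micromolar(pic50):.2f} µM")
```

Both conversions agree with the stated IC50 values to two decimal places, which confirms that the concentrations are in the micromolar range.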
Convergent Derivation of Proline-Rich Neurotoxic Peptides-The A. feae proline-rich peptides were found not to be derivations of the SVMP propeptide region but rather of the propeptide region of the C-type natriuretic peptide (CNP) expressed in snake venoms. The A. feae peptide was encoded in tandem repeats, with some transcripts containing two repeats of the proline-rich neurotoxic domain whereas others contain three repeats (GenBank accession numbers JX467171 and JX467172). Intriguingly, sequence alignment reveals that this region is distinct from the domain that encodes the bradykinin-potentiating peptides in the CNP precursor (Fig. 6).

To avoid confusion, previously obtained sequences are given with their UniProt accession numbers whereas ones obtained in this study are given with their Genbank accession numbers (Table S1).

DISCUSSION
Chelation of the Zn2+ ion present in proteases inhibits the proteolytic effect of these enzymes. Although it is unclear which domain of multidomain SVMP is responsible for the hemorrhagic effect of this toxin, chelation of the Zn2+ ion in the protease domain inhibits this activity. It is likely, therefore, that the protease domain of multidomain SVMP is responsible for the hemorrhaging often observed in envenomations by snakes with venoms rich in this toxin type (54,55). Moreover, it is hypothesized that the presence of the cysteine-rich and disintegrin-like domains results in the increased potency of some forms of SVMP (P-III) in comparison with those that lack them (P-I and P-II). We show that more than half of the positively selected sites detected were confined to the protease domain whereas the remaining variations were shared between the cysteine-rich and disintegrin-like domains, suggesting that these domains play a prominent role in SVMP-induced inflammatory reactions. It has been shown in the past that venom components accumulate mutations in functional regions, on the molecular surface, and in the tips of loops of the toxins while preserving the ancient scaffold that provides structural stability (56-59). Most venom novelties are derived from mutations of the protein surface residues, which not only increase the number of target sites in the prey for these toxins but could also aid in avoiding the host immune response. We have shown that snake venom metalloproteases exhibit a similar phenomenon, with 60% of all positively selected residues being confined to the molecular surface of the toxin whereas only 12% were buried (the remaining 29% could not be assigned to the exposed or buried classes) (Fig. 4). Moreover, 37 cysteine residues remain unmodified, which is indicative of the importance of cysteine residues in stabilization of venom proteins (59).
P. mossambicus SVMP consist solely of the propeptide domains, with the additional domains typically found in SVMP having been lost from the ancestral scaffold. Our results demonstrate that the loss of these domains has exposed the propeptide domains to novel evolutionary selection pressures (Table I; Figs. 3 and 4). Importantly, as a logical consequence of the partial loss of weaponry, an increased selection pressure is applied, driving a rapid rate of mutations. Driven by positive selection, this has resulted in neofunctionalization of these venom components. We report here the genesis of a novel postsynaptic neurotoxic activity through inhibition of postsynaptic α7 nicotinic acetylcholine receptors by some of the Psammophis propeptide SVMP forms (UniProt accession numbers A7X4A6 and A7X461). Given the rapid rate of mutations and strong evolutionary selection pressures, the possibility of participation by the other Psammophis SVMP isoforms in envenoming through additional novel mechanisms cannot be ruled out. In contrast to Psammophis, E. coloratus expresses all the SVMP domains specialized for envenoming (protease, cysteine-rich, and disintegrin-like domains, i.e., P-III SVMP). Consequently, the propeptide region, which is post-translationally excised from the final toxin, exhibits no evidence of adaptive evolution (ω = 0.46), suggesting that it lacks a significant role in envenoming (Table I; Figs. 3 and 4). Some species of Echis not only selectively express the propeptide region of the SVMP but also have venom genes encoding highly lethal multidomain SVMP. Hence, they do not exhibit any variation in the propeptide-only toxins, which evolve under the influence of negative selection (ω = 0.52).
It has previously been hypothesized that snake toxins often consist of the smallest functional domain of a large multidomain protein, which acts as a provision for innovations (60). In contrast to this, we show that domains that contribute toward envenoming are still capable of accumulating rapid mutations under the influence of positive selection. Evaluation of selection pressures on different domains of Echis SVMP revealed that most of the mutations were directed toward the protease domain (~54%; ω = 1.28), highlighting the importance of variation in this region believed to be responsible for inducing hemorrhage (Table II and Fig. 4). We speculate that such mutations may be important for subfunctionalization or increasing the potency of these proteins, or perhaps act as a weapon in the co-evolutionary arms race against prey resistance (61,62). A major proportion of the remaining variations in Echis SVMP were detected in the cysteine-rich and disintegrin-like domains (~29%, ω = 1.04 and 17%, ω = 0.91, respectively).
[Figure legend, opening text missing] ... amino acids (exposed or buried) in the crystal structure of Echis coloratus SVMP is presented. Residues with an ASA ratio of more than 50% (above the blue line) are considered to be exposed to the surrounding solvent whereas those with a ratio of less than 20% (below the red line) are considered to be buried. The coordinates of positively selected sites in the protease, disintegrin-like, and cysteine-rich domains are shown as big green, blue, and orange circles, respectively. Three-dimensional structures of each SVMP domain depicting the locations of positively selected sites (in red) along with the model 8 omega are also presented.
The fact that 60% of all positively selected sites (and associated variations) occur on the molecular surface whereas only 12% correspond to buried residues highlights the importance of changes in the surface chemistry (Fig. 4). Accumulation of surface mutations may facilitate interaction with novel targets and may also aid in avoidance of the host's immune response. The complete conservation of the 37 ancestral cysteine residues highlights the prominent role that disulfide bridges play in stabilizing and structurally scaffolding venom peptides and proteins.
Evidence provided by various analyses [site-specific model 2a, model 3, model 8, Single Likelihood Ancestor Counting, Fixed-Effects Likelihood, Random Effects Likelihood, Mixed Effects Model Evolution (Table II and supplemental Table II); evolutionary fingerprint analyses (supplemental Fig. S1); and Mgene with option G test (Table II)] highlights the strong influence of positive Darwinian selection in shaping the various domains of E. coloratus multidomain SVMP responsible for toxicity (protease, disintegrin-like, and cysteine-rich). The propeptide domain, which is not known to play a significant role in envenoming, was subject to negative selection. This region is excised from the final protein as part of the post-translational modification and never forms part of the lethal multidomain SVMP toxin. In contrast, P. mossambicus selectively expresses the propeptide region because of the deletion of ancestral domains. Hence, this domain experiences significant selection pressure and evolves under the influence of positive selection (Fig. 3).
We show that positive selection has influenced the Psammophis monodomain SVMP propeptide domain more than its E. coloratus multidomain SVMP counterpart. A few species of Echis that express similar propeptide-only domains also express the regular multidomain SVMPs. Hence, they do not require the same level of variation as the Psammophis protein.
These Echis monodomain propeptide-only SVMP appear to be subject to the same regime of negative selection as the propeptide domain expressed as part of the multidomain gene. Thus, the molecular evolution patterns and expression levels are consistent with a significant role being played by the propeptide-only toxins in the venom of P. mossambicus (extreme diversification and high expression) but not in Echis sp. (little diversification and low expression). This work reveals that the neglected proline-rich peptides are a source of novel ligands that, because of their uniqueness and small size, may prove to be of significant value in drug design and development.
Functional testing of the P. mossambicus SVMP propeptide-only domain variants Pm1 and Pm2 yielded responses on neuronal nAChRs, specifically on the human α7 neuronal receptor (Fig. 5). α7 neuronal receptors, which are nAChRs composed solely of α7 subunits, have been documented as being convergently targeted by toxins such as α-conotoxin PnIA (A10L D14K), which has a high affinity for the α7 neuronal receptor (63). Eleven subunits of mammalian neuronal nAChRs have been classified to date: eight α subunits (α2-7, α9, α10) and three β subunits (β2-4), with an additional α subunit, α8, present in the chick optic nerve (64). Similarities in receptors are apparent between species: neuronal nAChR subunits in the chicken, in particular the α2, α3, and α4 subunits, show 85% homology in conserved domains with their human or mouse counterparts (Nef et al., 1988). Despite the similarities between chick and mammalian neuronal, and potentially neuromuscular, nAChR subtypes and the genes that encode them, there are large disparities in the pharmacology of the expressed receptors between the two groups (66), which may account for species-specific toxins. One such example is denmotoxin, an avian-specific post-synaptic neurotoxin isolated from Boiga dendrophila, which has potent activity at neuromuscular nAChRs in the chick but a weaker effect in the mouse (67). Unlike neuronal nAChRs, neuromuscular nAChRs consist of five subunits comprising α and β in addition to γ, δ, and ε subunits, with a stoichiometry of (α1)2β1γδ in which the ε subunit replaces the γ subunit in developed forms of the receptor (68). Neuromuscular nAChRs are targeted by α-neurotoxins, also referred to as curare-mimetic or post-synaptic neurotoxins (69), an example of which is acantoxin IVa isolated from Acanthophis sp. seram (70).
The absence of any activity of Pm1 and Pm2 at the muscle end plate nAChR in the chick biventer cervicis, together with their activity on human α7 neuronal receptors, confirms that both variants affect neuronal as opposed to neuromuscular nicotinic acetylcholine receptors.
The two peptides tested in this study illustrate that, because of the extreme selection pressure to which they are subjected, venom components often have exquisitely subtle sources of functional variation. Despite varying by only a single amino acid (Y for N at the second-to-last position), there was a slight but significant difference in potency between the two forms of Psammophis propeptide, with the Y-containing Pm2 the more potent of the two. The peptides from Psammophis and Azemiops are a remarkable example of convergent evolution using two different gene types as the starting material (SVMP and CNP, respectively). In both cases, there was de novo evolution of proline-rich peptides in the propeptide domain of a precursor. Additionally, there was a degree of functional convergence, as both target the nicotinic acetylcholine receptor, although each targets a different subtype (neuronal and neuromuscular, respectively). Both the variation between the two Psammophis peptides and the convergence of the Psammophis and Azemiops peptides reinforce the wealth of novel peptides to be found in understudied snake venoms.
This article contains supplemental Fig. S1 and Tables S1 and S2.
‖‖ Joint first authors.