Isolation and Characterization of Insecticidal Toxins from the Venom of the North African Scorpion, Buthacus leptochelys

Various bioactive peptides have been identified in scorpion venom, but there are many scorpion species whose venom has not been investigated. In this study, we characterized venom components of the North African scorpion, Buthacus leptochelys, by mass spectrometric analysis and evaluated their insect toxicity. This is the first report of chemical and biological characterization of the B. leptochelys venom. LC/MS analysis detected at least 148 components in the venom. We isolated four peptides that show insect toxicity (Bl-1, Bl-2, Bl-3, and Bl-4) through bioassay-guided HPLC fractionation. These toxins were found to be similar to scorpion α- and β-toxins based on their N-terminal sequences. Among them, the complete primary structure of Bl-1 was determined by a combination of Edman degradation and MS/MS analysis. Bl-1 is composed of 67 amino acid residues and is crosslinked by four disulfide bonds. Since Bl-1 shares high sequence similarity with α-like toxins, it is likely that it acts on Na+ channels of both insects and mammals.


Introduction
Scorpions are the oldest arachnids and can be traced back to the Silurian period [1,2]. Currently, over 2400 scorpion species are widely distributed on all continents except Antarctica [3,4]. Scorpions have adapted to different environments such as deserts, forests, grasslands, and caves because they can use toxic components in their venom to effectively capture prey and protect themselves from predators. Scorpion venom contains inorganic salts, amino acids, nucleic acids, peptides, and proteins, and some peptides show anti-insect and/or anti-mammal activities [5].

Purification of Insecticidal Peptides
The crude venom dissolved in distilled water was separated on a C4 HPLC column (Figure 2A). Several fractions obtained based on the major chromatographic peaks were tested for insect toxicity using crickets. In this study, each fraction was injected at a dose equivalent to 160 ng venom/mg body weight, which is five times higher than the LD50 value, to identify the peptides primarily responsible for the insect toxicity of the venom. Three fractions (I-III) were found to be highly toxic, and fraction I showed the strongest toxicity, inducing death or paralysis in all insects tested 48 h after injection. Fractions II and III showed relatively weak toxicity, inducing transient paralysis in all insects tested. Differences in the insect toxicity between fractions could be attributed to the number and/or amount of active components in each fraction as well as to their intrinsic activity. These fractions were further separated on a C18 HPLC column to obtain single components. Four components (named Bl-1, 2, 3, and 4; Figure 2B-D) showed toxicity, and their monoisotopic molecular masses were determined as 7107.2, 7343.9, 7173.8, and 7828.1 Da, respectively (Figure S1). The N-terminal sequences of these peptides were determined by the Edman method as shown in Table 1.

A BLAST search revealed that Bl-1 is similar to α-like insect and mammal toxins such as Lqh3 from Leiurus quinquestriatus hebraeus [31] and Bom3 from Buthus occitanus mardochei [32]. Bl-2 is similar to α-mammalian toxins such as Lqq5 from L. quinquestriatus quinquestriatus [33] and AaH2 from Androctonus australis Hector [34]. Bl-3 is similar to α-insect and mammalian toxins such as OD1 from Odontobuthus doriae [35] and Bu-1 from Buthacus macrocentrus [25]. Bl-4 is similar to β-insect toxins such as AaHIT1 from A. australis Hector [36] and LqqIT1 from L. quinquestriatus quinquestriatus [37]. This suggests that the four insecticidal peptides isolated from the B. leptochelys venom may act on the insect Na+ channel, although the selectivity of their action between mammals and insects may vary among peptides. The α- and β-toxins have been identified exclusively from the venom of the Buthidae scorpions, but the ratio between the number of α- and β-toxins in the venom is known to differ by species. For example, 9 α- and 12 β-toxins were identified from the venom gland transcriptome of Lychas mucronatus [38], whereas 1 α- and 12 β-toxin sequences were identified from the I. maculatus transcriptome [39]. In addition, only β-toxins were isolated as insecticidal neurotoxins from the I. maculatus venom [40,41]. The fact that the insecticidal activity of B. leptochelys venom is relatively higher than that of I. maculatus may be attributed to the existence of multiple α-toxins.

Primary Structure of Bl-1
Bl-1 was further subjected to sequencing analysis to obtain its complete primary structure because it showed the most significant insect toxicity in this study. A 472 Da mass shift after carboxymethylation of Bl-1 is indicative of the presence of eight Cys residues (59 Da × 8) that form four disulfide bridges (Figure S2). Bl-1 was digested with endoproteinase Lys-C, and the resulting peptide fragments were purified by HPLC (Figure S3). The sequences of three fragments (L1, L2, and L3 with molecular masses of 3282.3, 3918.9, and 2686.3 Da, respectively) were determined by Edman and/or MS/MS sequencing analysis (Figure S4). Discrimination between Leu and Ile at several positions in the fragments during MS/MS analysis was achieved based on the side-chain fragmentation observed under HE-CID conditions, in which the occurrence of key fragment ions (d-ions) allowed for their assignment (Figure S5). The sequences of L1 and L3 were determined as ARDGYISQPENCVYHCFPGSSG(CD/DC)TLCK and EGRGLACWCLELPDNVGIIVDIGK, respectively, by a combination of Edman and MS/MS sequencing analysis (Figure 3). The N-terminal sequence of L2 was also determined as EKGGTGGHCGYKEGRGLA by Edman analysis, but other sequence information was not obtained by MS/MS analysis due to its large molecular mass. To assign the undefined order of the cysteine and aspartic acid residues (CD or DC) in L1, Bl-1 was sequentially digested with Lys-C and chymotrypsin (Figure S3). The short fragment LC1 (molecular mass of 1628.6 Da), consisting of the C-terminal half of L1, was subjected to MS/MS analysis, and its sequence was determined as HCFPGSSGCDTLCK (Figure 3 and Figure S6). Moreover, carboxymethylated Bl-1 was digested with chymotrypsin, and the peptide fragment C1 (molecular mass of 2154.1 Da) was purified by HPLC (Figure S3). MS/MS analysis revealed the sequence of C1 as CLELPDNVGIIVDIGKCHT-NH2 by considering the sequence of L3 (Figure 3 and Figure S6). Since C1 has an amidated C-terminus, it was assigned as the C-terminal end of Bl-1 (Figure 4). The fragment C2 (molecular mass of 5444.2 Da) was also detected by LC/MS analysis, which confirms the connection between L1 and L2 based on its molecular mass and partially determined sequence (Figure S7). Finally, the complete primary structure of Bl-1 was successfully determined by integrating all the information obtained above (Figure 4).
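The assembly logic described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the procedure used by the authors: the full-length sequence is assembled here only from the fragment sequences quoted in the text (with the CD order taken from LC1, and L1 joined directly to L2 as supported by fragment C2), and the helper names `merge` and `monoisotopic_mass` are hypothetical. The residue masses are standard monoisotopic values. The sketch checks that the assembled chain has 67 residues and eight Cys, and that its calculated mass with four disulfide bonds and an amidated C-terminus lies close to the reported 7107.2 Da.

```python
# Standard monoisotopic residue masses (Da) for the residues found in the fragments.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
HYDROGEN = 1.007825
AMIDATION = -0.98402  # C-terminal -OH replaced by -NH2

# Fragment sequences as quoted in the text.
L1 = "ARDGYISQPENCVYHCFPGSSGCDTLCK"   # CD order resolved by fragment LC1
L2_NTERM = "EKGGTGGHCGYKEGRGLA"       # N-terminal part of L2 (Edman)
L3 = "EGRGLACWCLELPDNVGIIVDIGK"
C1 = "CLELPDNVGIIVDIGKCHT"            # C-terminal fragment, amidated


def merge(left: str, right: str, min_overlap: int = 3) -> str:
    """Join two sequences on their longest suffix/prefix overlap."""
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    raise ValueError("no overlap found")


def monoisotopic_mass(seq: str, disulfides: int = 0, amidated: bool = False) -> float:
    """Monoisotopic mass of a linear peptide, corrected for S-S bonds and amidation."""
    mass = sum(RESIDUE[aa] for aa in seq) + WATER
    mass -= 2 * HYDROGEN * disulfides  # each disulfide bond removes two H atoms
    if amidated:
        mass += AMIDATION
    return mass


# L2 overlaps L3, and L3 overlaps C1; L1 is placed directly before L2 (fragment C2).
bl1 = L1 + merge(merge(L2_NTERM, L3), C1)

# Cys count implied by the 472 Da shift at 59 Da per Cys.
cys_from_shift = round(472 / 59)

mass = monoisotopic_mass(bl1, disulfides=4, amidated=True)
print(bl1, len(bl1), bl1.count("C"), cys_from_shift, round(mass, 1))
```

A calculated mass that agrees with the measured value is a useful consistency check on the assembled sequence, although it cannot by itself distinguish isobaric alternatives such as Leu and Ile.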

Sequence Comparison
A BLAST search for the full sequence of Bl-1 revealed that the peptide is similar to scorpion peptides classified as α-like toxins, which can modulate both insect and mammalian Na+ channels as described above (Figure 5A). This suggests that Bl-1 also shows toxicity against mammals, although mammal toxicity could not be evaluated due to the limited amount of the purified sample. Among the α-like toxins, the structure-activity relationship was comprehensively investigated for Lqh3 [42]. That study revealed that two distinct domains are particularly important for its binding to insect Na+ channels. One (Core-domain) consists of three residues (His15, Phe17, and Pro18) preceding the α-helix and two residues (Phe39 and Leu45) in the β-strands. The other (NC-domain) is constituted by the C-terminal region, where Ile59, Lys64, and His66 contribute to the activity (Figure 5B). These residues are also observed in Bl-1, except for Phe39. Since it is the aromatic ring of Phe39 that is important for the activity, and Tyr retains this ring, the substitution of Phe39 with Tyr in Bl-1 may not affect the activity. To further confirm the structural similarity between Lqh3 and Bl-1, a three-dimensional structure of Bl-1 was constructed by homology modeling using Lqh3 as a template (Figure 5B). As expected, the positions of all amino acid residues important for expression of full activity were almost identical between Lqh3 and Bl-1. This suggests that Bl-1 exerts its insect toxicity through the same mechanism as α-like toxins such as Lqh3.
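The residue comparison above can be illustrated with a short Python sketch. It assumes that Bl-1 and Lqh3 share the same residue numbering at these positions (no alignment gaps), and it uses a Bl-1 sequence assembled from the fragment sequences reported in the Primary Structure section rather than one copied from Figure 4. The names `LQH3_KEY` and `compare_sites` are hypothetical.

```python
# Bl-1 sequence assembled from the reported fragments (assumption, see lead-in).
BL1 = "ARDGYISQPENCVYHCFPGSSGCDTLCKEKGGTGGHCGYKEGRGLACWCLELPDNVGIIVDIGKCHT"

# Residues described in the text as important in Lqh3 (1-based position -> residue).
LQH3_KEY = {15: "H", 17: "F", 18: "P", 39: "F", 45: "L", 59: "I", 64: "K", 66: "H"}


def compare_sites(seq: str, key_sites: dict) -> dict:
    """Return {position: (reference residue, residue found in seq)}."""
    return {pos: (ref, seq[pos - 1]) for pos, ref in key_sites.items()}


sites = compare_sites(BL1, LQH3_KEY)
# Keep only the positions where Bl-1 differs from the Lqh3 residue.
differences = {pos: pair for pos, pair in sites.items() if pair[0] != pair[1]}

for pos, (ref, found) in sorted(sites.items()):
    flag = "" if ref == found else "  <- substituted"
    print(f"{pos:>2}: Lqh3 {ref} / Bl-1 {found}{flag}")
```

Under these assumptions the only mismatch is at position 39, where Bl-1 carries Tyr instead of Phe, consistent with the statement in the text.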

Toxins 2019, 11. [Figure caption, beginning missing] ... determined by Edman degradation is shown. Leu and Ile residues that are underlined were determined by MS/MS analysis under the HE-CID condition.


Conclusions
We characterized venom components of the North African scorpion, B. leptochelys, by mass spectrometric analysis and isolated the insecticidal peptides by the bioassay-guided fractionation approach. To our knowledge, this is the first report of the chemical and biological characterization of the B. leptochelys venom. Mass spectrometric analysis revealed that the venom components are mainly composed of two distinct groups based on the molecular mass ranges: one from 3000-5000 Da and the other from 7000-8000 Da, which is commonly observed for Buthidae scorpions. N-terminal sequences of four insecticidal peptides (Bl-1, Bl-2, Bl-3, and Bl-4) isolated from the B. leptochelys venom indicated that they are long-chain toxins that could specifically or non-specifically act on insect Na+ channels. Among them, the primary structure of Bl-1 was completely determined by a combination of Edman and MS/MS sequencing analysis, showing it to be an α-like toxin, which is likely to act on both insect and mammal Na+ channels. Insect toxicity is a common biological characteristic of scorpion venom, but the structures of insecticidal toxins and their combinations are diverse and complex among scorpion species. The results obtained in this study will provide a clue to understanding the synergistic role of α- and β-toxins in insecticidal activity in Buthidae scorpion venom.

Collection of Venom
B. leptochelys scorpions were collected in the Western Mediterranean coastal desert of Marsa Matruh in Egypt. The venom was collected in a microtube by squeezing the venom glands with fine forceps and was dissolved in distilled water. The crude venom was centrifuged at 14,000 rpm for 10 min at 4 °C. The supernatants were pooled, lyophilized, and stored at −80 °C.

Bioassay
Insect toxicity was tested by injection of 1-2 µL sample solutions in distilled water into the abdominal cavity of crickets (Acheta domesticus, 50 ± 5 mg body weight). Distilled water was injected as a negative control. Several doses of the venom were injected, and ten animals were used for each dose. For evaluation of each HPLC fraction, six animals were used. The number of paralyzed or dead animals was counted 48 h after injection. The dose required to induce 50% mortality (LD50) was calculated using the statistical software GraphPad Prism 4 (GraphPad Software, San Diego, CA, USA). The research using experimental animals was approved by the Animal Experimentation Committee at Kyoto University (Permission number: 30-8; date of approval: 1 April 2018).
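The dosing arithmetic implied by the bioassay can be made explicit with a small Python sketch. It reads "five times higher than the LD50" in the purification section as a dose of 5 × LD50, which is an interpretation, and the function and constant names are hypothetical. It converts the 160 ng/mg fraction dose into an absolute amount per cricket over the stated 50 ± 5 mg body-weight range and into a sample concentration for the 1-2 µL injection volume.

```python
FRACTION_DOSE_NG_PER_MG = 160.0  # venom-equivalent dose used for each HPLC fraction
FOLD_OVER_LD50 = 5.0             # assumption: dose = 5 x LD50
BODY_WEIGHT_MG = 50.0            # nominal cricket body weight
BODY_WEIGHT_TOL_MG = 5.0         # +/- range given in the text


def implied_ld50(dose_ng_per_mg: float, fold: float) -> float:
    """LD50 (ng/mg) implied by a dose stated as a multiple of the LD50."""
    return dose_ng_per_mg / fold


def dose_per_insect_ug(dose_ng_per_mg: float, body_weight_mg: float) -> float:
    """Absolute amount injected per insect, in micrograms."""
    return dose_ng_per_mg * body_weight_mg / 1000.0


def concentration_ug_per_ul(amount_ug: float, volume_ul: float) -> float:
    """Sample concentration needed to deliver amount_ug in volume_ul."""
    return amount_ug / volume_ul


ld50 = implied_ld50(FRACTION_DOSE_NG_PER_MG, FOLD_OVER_LD50)
nominal = dose_per_insect_ug(FRACTION_DOSE_NG_PER_MG, BODY_WEIGHT_MG)
lightest = dose_per_insect_ug(FRACTION_DOSE_NG_PER_MG, BODY_WEIGHT_MG - BODY_WEIGHT_TOL_MG)
heaviest = dose_per_insect_ug(FRACTION_DOSE_NG_PER_MG, BODY_WEIGHT_MG + BODY_WEIGHT_TOL_MG)

print(f"implied LD50: {ld50} ng/mg")
print(f"per insect: {nominal} ug (range {lightest}-{heaviest} ug)")
print(f"concentration for 1 uL: {concentration_ug_per_ul(nominal, 1.0)} ug/uL")
print(f"concentration for 2 uL: {concentration_ug_per_ul(nominal, 2.0)} ug/uL")
```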

Mass Spectrometric Analysis
LC/MS and LC/MSn measurements were carried out in the positive mode on an LCMS IT-TOF (Shimadzu, Kyoto, Japan) equipped with an electrospray ion source. HPLC separation was carried out on a reversed-phase C18 column (TSK-GEL, 1.0 mm ID × 150 mm, TOSOH, Tokyo, Japan). The column was eluted using a linear gradient from 5 to 60% solvent B (0.1% formic acid in acetonitrile) in solvent A (0.1% formic acid in water) for 110 min at a flow rate of 0.05 mL/min. The mass scale was externally calibrated using sodium trifluoroacetate cluster ions. Spectra were obtained over a mass range from m/z 400 to 2000, and the multiply charged molecular ions were manually deconvoluted to obtain molecular masses. The monoisotopic m/z values in each multiply charged ion were used for deconvolution.
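The manual deconvolution of multiply charged ions mentioned above follows from the relation m/z = (M + z × 1.007276)/z for protonated ions in positive mode. The Python sketch below is a generic illustration of that relation, not the authors' workflow; the function names are hypothetical, and the m/z values are synthetic, generated from the reported Bl-1 mass of 7107.2 Da rather than taken from a measured spectrum.

```python
PROTON = 1.007276  # mass of a proton (Da)


def mz_for_charge(neutral_mass: float, z: int) -> float:
    """m/z of the [M + zH]z+ ion."""
    return neutral_mass / z + PROTON


def charge_from_adjacent(mz_lower_charge: float, mz_higher_charge: float) -> int:
    """Charge of the higher-m/z peak, given two adjacent charge states (z and z + 1)."""
    return round((mz_higher_charge - PROTON) / (mz_lower_charge - mz_higher_charge))


def neutral_mass(mz: float, z: int) -> float:
    """Neutral mass recovered from one multiply charged ion."""
    return z * (mz - PROTON)


def charges_in_window(mass: float, mz_min: float = 400.0, mz_max: float = 2000.0) -> list:
    """Charge states whose m/z falls inside the acquisition window."""
    return [z for z in range(1, 51) if mz_min <= mz_for_charge(mass, z) <= mz_max]


# Synthetic example: two adjacent charge states of a 7107.2 Da peptide.
mz5 = mz_for_charge(7107.2, 5)
mz6 = mz_for_charge(7107.2, 6)

z = charge_from_adjacent(mz5, mz6)
print(z, round(neutral_mass(mz5, z), 1))
print(charges_in_window(7107.2))
```

For a peptide of about 7 kDa, many charge states fall inside the m/z 400 to 2000 window, so the mass can be cross-checked against several ions of the same series.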
MALDI-TOF/TOF MS measurements were carried out on an Autoflex III smart beam (Bruker Daltonics, Billerica, MA, USA) with a nitrogen pulsed laser (337 nm). Samples were dissolved in 0.1% TFA in 50% acetonitrile/water and mixed with a matrix solution containing 10 mg/mL of α-cyano-4-hydroxycinnamic acid (CHCA) in acetone. An aliquot (0.5 µL) of matrix/acetone solution was spotted onto the MALDI sample target to generate a thin layer of matrix crystal. Then, 1 µL of the matrix/0.1% TFA in 50% acetonitrile/water solution was spotted onto the thin layer and then dried at room temperature. External calibration of the mass scale was carried out using the molecular masses

HPLC Purification
The crude venom (2.0 mg) was dissolved in distilled water and separated by HPLC on a reversed-phase C4 column (Protein C4, 10 mm ID × 250 mm, Grace Vydac, Deerfield, IL, USA). The column was eluted using a linear gradient from 15 to 60% solvent D (0.08% TFA in acetonitrile) in solvent C (0.1% TFA in water) for 50 min at a flow rate of 2 mL/min. Elution was monitored by UV absorbance at 215 and 280 nm. Seven fractions were obtained based on the major chromatographic peaks, and each fraction was subjected to the insect toxicity tests as described above. Fractions that showed insect toxicity were further separated on a reversed-phase C18 column (Everest C18, 1.0 mm ID × 250 mm, Grace Vydac). The column was eluted with solvents C and D at a flow rate of 0.05 mL/min using a linear gradient from 20 to 50% solvent D for 45 min (fraction I), from 15 to 60% solvent D for 45 min (fraction II), and from 20 to 40% solvent D for 45 min (fraction III). Chromatographic peaks obtained from each fraction were subjected to insect toxicity tests to find the toxic component. The monoisotopic molecular mass of these components was obtained by LC/MS analysis as described above.
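The linear gradients listed above can be compared by their slopes. The following Python sketch is a simple illustration with hypothetical names; it computes the nominal solvent D percentage at a given time and the gradient steepness for each program, ignoring system dwell volume and any hold or wash steps, which the text does not specify.

```python
# Gradient programs from the text: (start % D, end % D, duration in min).
GRADIENTS = {
    "C4 crude venom": (15.0, 60.0, 50.0),
    "C18 fraction I": (20.0, 50.0, 45.0),
    "C18 fraction II": (15.0, 60.0, 45.0),
    "C18 fraction III": (20.0, 40.0, 45.0),
}


def percent_d(t_min: float, start_pct: float, end_pct: float, duration_min: float) -> float:
    """Nominal % solvent D at time t_min of a linear gradient."""
    if not 0.0 <= t_min <= duration_min:
        raise ValueError("time outside the gradient window")
    return start_pct + (end_pct - start_pct) * t_min / duration_min


def slope_pct_per_min(start_pct: float, end_pct: float, duration_min: float) -> float:
    """Gradient steepness in % solvent D per minute."""
    return (end_pct - start_pct) / duration_min


for name, program in GRADIENTS.items():
    midpoint = percent_d(program[2] / 2, *program)
    print(f"{name}: {slope_pct_per_min(*program):.2f} %/min, {midpoint:.1f}% D at midpoint")
```

A shallower gradient, such as the one used for fraction III, spreads closely eluting peptides over a longer time window.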

Determination of N-terminal Sequence
The peptide (200 pmol) was dissolved in 0.2 M Tris buffer (pH 8.0, 30 µL) containing 6 M guanidine-HCl. To this solution, 10 µL of 45 mM dithiothreitol (DTT) was added, and the mixture was incubated for 1 h at 50 °C. Then, the reaction mixture was mixed with 10 µL of 100 mM iodoacetic acid to alkylate Cys side chains and incubated for 1 h at 28 °C in the dark. The peptide with carboxymethylated Cys residues was purified by HPLC and subjected to Edman sequencing analysis (PPSQ-21A, Shimadzu).
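The reagent amounts in this protocol can be checked with a short Python sketch. It is a back-of-the-envelope illustration with hypothetical names. It assumes eight Cys per peptide (the value reported for Bl-1) and counts two thiol groups per DTT molecule; volumes are treated as additive.

```python
PEPTIDE_PMOL = 200.0
CYS_PER_PEPTIDE = 8        # assumption: value reported for Bl-1
BUFFER_UL = 30.0
DTT_UL, DTT_MM = 10.0, 45.0
IAA_UL, IAA_MM = 10.0, 100.0
THIOLS_PER_DTT = 2


def nmol(volume_ul: float, conc_mm: float) -> float:
    """Amount in nmol: 1 uL of a 1 mM solution contains 1 nmol."""
    return volume_ul * conc_mm


dtt_nmol = nmol(DTT_UL, DTT_MM)
iaa_nmol = nmol(IAA_UL, IAA_MM)
peptide_thiol_nmol = PEPTIDE_PMOL / 1000.0 * CYS_PER_PEPTIDE

# Molar excess of DTT over the Cys residues to be reduced.
dtt_excess = dtt_nmol / peptide_thiol_nmol
# Iodoacetic acid relative to all thiols present (peptide plus DTT).
total_thiol_nmol = peptide_thiol_nmol + THIOLS_PER_DTT * dtt_nmol
iaa_ratio = iaa_nmol / total_thiol_nmol

# Final concentrations after each addition.
dtt_final_mm = dtt_nmol / (BUFFER_UL + DTT_UL)
iaa_final_mm = iaa_nmol / (BUFFER_UL + DTT_UL + IAA_UL)

print(f"DTT: {dtt_nmol} nmol, {dtt_excess:.0f}-fold over peptide Cys, {dtt_final_mm} mM final")
print(f"Iodoacetic acid: {iaa_nmol} nmol, {iaa_ratio:.2f}x total thiols, {iaa_final_mm} mM final")
```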

Enzymatic Digestion
For Lys-C digestion, the peptide solution after the carboxymethylation reaction was diluted with a twofold volume of distilled water and mixed with Lys-C (Wako Pure Chemical Industries, Osaka, Japan) at an enzyme-to-substrate ratio of 1:100 (w/w). After incubation for 18 h at 37 °C, digested peptide fragments were subjected to HPLC separation on a C18 column (TSK-GEL, 1.0 mm ID × 150 mm, TOSOH) eluted using a linear gradient from 15 to 60% solvent D in solvent C for 45 min at a flow rate of 0.05 mL/min. For chymotrypsin digestion, the peptide solution after the carboxymethylation reaction was purified by HPLC on a C18 column (Everest C18, 1.0 mm ID × 250 mm, Grace Vydac) eluted using a linear gradient from 5 to 90% solvent D in solvent C for 85 min at a flow rate of 0.05 mL/min. After lyophilization, the purified peptide was dissolved in 100 µL of distilled water and mixed with chymotrypsin (Roche Diagnostics K.K., Tokyo, Japan) at an enzyme-to-substrate ratio of 1:100 (w/w). After incubation for 18 h at 37 °C, digested peptide fragments were subjected to HPLC separation on a C18 column (Everest C18, 1.0 mm ID × 250 mm, Grace Vydac) eluted with solvents A and B at a flow rate of 0.05 mL/min using a linear gradient from 5 to 90% solvent B for 85 min. For sequential digestion with Lys-C and chymotrypsin, the peptide solution after Lys-C digestion for 18 h at 37 °C was mixed with chymotrypsin. After incubation for 18 h at 37 °C, digested peptide fragments were subjected to HPLC separation on a C18 column as described above.
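An in silico digest helps relate the enzymatic steps above to the fragments named in the results. The Python sketch below models Lys-C as cleaving after every Lys, which is an idealized complete digest; the Bl-1 sequence is assembled from the fragment sequences reported in the text, and `digest_after` is a hypothetical helper.

```python
# Bl-1 sequence assembled from the reported fragments (assumption, see lead-in).
BL1 = "ARDGYISQPENCVYHCFPGSSGCDTLCKEKGGTGGHCGYKEGRGLACWCLELPDNVGIIVDIGKCHT"


def digest_after(seq: str, residues: str) -> list:
    """Cleave seq on the C-terminal side of every residue listed in `residues`."""
    fragments, start = [], 0
    for i, aa in enumerate(seq):
        if aa in residues:
            fragments.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        fragments.append(seq[start:])  # C-terminal remainder without a cleavage site
    return fragments


lysc = digest_after(BL1, "K")
for fragment in lysc:
    print(len(fragment), fragment)

# The reported N-terminal sequence of L2 spans three of the idealized fragments.
l2_region = "".join(lysc[1:4])
print(l2_region.startswith("EKGGTGGHCGYKEGRGLA"))
```

In this idealized digest the first and fourth fragments correspond to the reported L1 and L3 sequences. The reported L2 N-terminal sequence contains internal Lys residues, so L2 appears to span sites that a complete digest would have cut.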

Homology Modeling
To construct the three-dimensional model, the homology modeling software Isolated-FAMS (In-Silico Sciences Inc., Tokyo, Japan) was used [44]. The primary sequence of Bl-1 was automatically aligned with that of Lqh3, and the structure of each toxin was optimized by the simulated annealing method of FAMS-ligand using the coordinates of Lqh3 (PDB ID: 1FH3) as a template.
Supplementary Materials: The following are available online at http://www.mdpi.com/2072-6651/11/4/236/s1, Table S1. List of monoisotopic molecular masses of the venom components obtained by LC/MS analysis. Numbers shown in bold indicate the peptides isolated in this study. Figure S1. Mass spectra of Bl-1, 2, 3, and 4. Figure S2. Results of LC/MS analysis of native (A) and carboxymethylated Bl-1 (B). Figure S3. HPLC chromatograms of peptide fragments obtained by digestion with Lys-C (A), with chymotrypsin after Lys-C (B), and with chymotrypsin (C). Figure S4. Product ion spectra of L1 obtained by LC/MS/MS (A) and MALDI-TOF/TOF MS analysis (B). Product ion spectra of L3 obtained by LC/MS/MS (C) and MALDI-TOF/TOF MS analysis (D). Figure S5. Product ion spectra of L1 (A) and L3 (B-F) obtained by MALDI-TOF/TOF MS analysis under HE-CID conditions. The mass region containing d- and a-ions necessary for Leu/Ile discrimination is shown. Vertical solid arrows show observed d-ions. Figure S6. Product ion spectra of LC1 (A) and C1 (B) obtained by MALDI-TOF/TOF MS analysis under HE-CID conditions. Figure