Venomics Study of Protobothrops flavoviridis Snake: How Venom Proteins Have Evolved and Diversified?

Venomics projects have been conducted to reveal the divergent venom profiles and the evolution of various venomous animals. Here, we describe the venomics project on the habu snake, including its genome and transcriptome, and its relevance to drug discovery. Venomics projects that include whole-genome decoding have partly revealed the mechanisms that produce diverse venom proteins, including accelerated evolution and alternative splicing, and how toxic organisms have evolved from nontoxic ones. In addition, venomics analyses of transcriptomes and proteomes across species reveal the relationship between the geographical distribution and the evolution of toxic organisms. The abundance of different gene products within a gene family, caused by accelerated evolution and alternative splicing, may help expand the repertoire of effective weapons for prey capture, accompanied by neofunctionalization.


Introduction
A wide variety of creatures have evolved over 3.8 billion years on Earth. Some organisms produce biological weapons called "toxins." How did these organisms evolve to become poisonous? Recent progress in genome sequencing technology has made it possible to analyze the whole genomes of non-model organisms as well as model ones, and whole-genome analysis has become an important tool for understanding their evolutionary history [1,2]. In "venomics projects," the genomes of venomous animals are deciphered, and the entire contents of their venoms are revealed by proteomics and transcriptomics in addition to genome analysis.

Toxin-producing organisms
Various creatures in nature carry "toxins." Toxins derived from living organisms are originally toxic organic compounds, such as alkaloids and polyethers, that are produced by plants and phytoplankton and then accumulate through the food chain. On the other hand, the toxins produced by venomous animals are called

Venomics projects
The "venomics projects," international joint research projects on the genome, transcriptome, and proteome analyses of various venomous animals, were proposed at the 14th International Society on Toxinology (IST) meeting held in Adelaide, Australia, in 2003 [3]. These projects aim to obtain information common to toxic organisms such as snakes, scorpions, spiders, bees, poisonous frogs, cone snails, sea anemones, and jellyfishes. They also provide new knowledge that leads to an understanding of venom production and transport systems and of the molecular mechanisms underlying the diversity of venom proteins, to the search for new toxic components relevant to drug discovery and for pharmacological agents that address unmet medical needs, and to new therapeutic treatments for bites and stings by venomous animals. For example, the European Venomics Project (completed in October 2015) was based on several omics analyses (mainly proteome and transcriptome analyses) of 203 venomous animal species, including scorpions, cone snails, poisonous spiders, snakes, and lizards, and resulted in the identification of 25,000 toxic protein/peptide sequences, of which 4000 were functionally analyzed [4].
In Japan, to identify novel toxins and to elucidate how venom proteins diversify through accelerated evolution, we deciphered the whole genome sequence of the habu snake (Protobothrops flavoviridis).
Here, we describe what we have learned from the venomics analyses on the genome and transcriptome decoding of habu snake (P. flavoviridis).

What is habu snake, P. flavoviridis?
Habu snakes, which inhabit the Nansei Islands (Southwest Islands) of Okinawa and Kagoshima prefectures, are the most dangerous native snakes in Japan (Figure 1). Because of their relatively large body size, long striking range, and the large amount of venom they deliver, many snakebites and envenomings still occur, especially during farming (about 80 to 100 cases per year). While habu snakes are specific animals designated by Japanese law, they are subject to extermination as animals of public health concern in many habitats, and some are also used commercially, for example in habu liquor and in leather products including the Okinawan musical instrument, the sanshin.
Among the 14 species of Protobothrops (The Reptile Database: http://www.reptile-database.org/), three species are endemic to Japan: P. flavoviridis from the Amami and Okinawa islands, P. tokarensis from the Tokara Islands, and P. elegans from the Yaeyama Islands (Figure 2A).
In addition to Protobothrops, Ovophis okinavensis (hime-habu) is distributed across the Amami and Okinawa islands. From the geological history of the Nansei Islands of Japan and Taiwan, it was expected that these Protobothrops snakes, including the Taiwan habu (P. mucrosquamatus) distributed in Taiwan, diversified between the beginning of the Quaternary Pleistocene and 2.0 million years ago, when the islands began to separate from the continent. The isolated environment of each island resulted in the differentiation and speciation of Protobothrops species. Molecular phylogenetic analyses using the full-length mtDNA genome sequences of Protobothrops snakes showed that the habu snake (P. flavoviridis) is close to the Tokara habu (P. tokarensis), and that the Sakishima habu (P. elegans) is close to the Taiwan habu (P. mucrosquamatus) [22]. These observations are consistent with the geological history of the Nansei Islands: the Okinawa-Amami islands and the Taiwan-Yaeyama islands were first separated by the Yangtze River (corresponding to the Kerama Gap), and the snakes diverged into two groups, one comprising the habu (P. flavoviridis) and Tokara habu (P. tokarensis) and the other comprising the Sakishima habu (P. elegans) and Taiwan habu (P. mucrosquamatus). Further, a remarkable genetic gap was observed between the Amami and Okinawa clades within P. flavoviridis. Interestingly, the Tokara habu (P. tokarensis) was found to be genetically closer to the Amami clade of P. flavoviridis than to the Okinawa clade. This indicates that some populations of the Amami clade of P. flavoviridis dispersed to the Tokara Islands (Takara and Kodakara islands) and differentiated into the Tokara habu (P. tokarensis) after the divergence of the Amami and Okinawa clades. In addition, the Sakishima habu and Taiwan habu diverged when the Yaeyama Islands were separated from Taiwan by the Yonaguni Gap (Figure 2B).
Because of the gap formed at the mouth of the old Yellow River (equivalent to the Tokara Gap), no Protobothrops snakes occur on the mainland of Japan beyond the Tokara Gap. In summary, the evolutionary history of Protobothrops speciation in the Nansei Islands is closely associated with the geological history of the islands.
Snake venoms are potentially lethal, complex mixtures of proteins and peptides encoded by multigene families that act specifically but synergistically to incapacitate the prey or opponent. Venom components can be classified by their effects as neurotoxic, cardiotoxic, cytotoxic, or hemorrhagic. Viper venoms are known as hemorrhagic venoms and contain components with a wide variety of physiological activities, such as metalloproteases (MPs) that destroy blood vessels, phospholipase A2 (PLA2) that causes inflammation and necrosis, and C-type lectin-like proteins (CTLPs) and serine proteases (SPs) that affect blood clotting. Since each of these peptide/protein toxins has very high specificity, they are expected to be useful tools for clarifying the complex mechanisms of life and to serve as pharmaceutical lead compounds. To fully characterize snake venom repertoires and to understand the molecular mechanisms involved in the evolution and physiological functions of snake venoms, "venomics studies" including whole genome decoding have been much anticipated.

Habu venomics: decoding of the habu genome reveals the evolutionary mechanism of venom-related genes that create a wide variety of venoms
The genome of the habu (P. flavoviridis) consists of 8 pairs of macro-chromosomes, including the ZW sex chromosomes, and 10 pairs of micro-chromosomes (total 2n = 36). The genome size was estimated to be approximately 1.8 Gb by FACS and 1.41 Gb by k-mer analysis [15]. Recently, we decoded the whole genome sequence of the habu snake: a total of 136 Gb of shotgun sequences were analyzed at a sequencing depth of about 96-fold, yielding the draft genome HabAm1, which includes 25,134 protein-coding genes (Table 1) [15]. Among the 20,540 annotated gene models of HabAm1, we validated 284 venom-related genes: 60 toxic protein genes (SV) and 224 of their non-venom paralog genes (NV). In total, 18 gene families were identified as venom-related, including metalloprotease, serine protease, C-type lectin-like protein, phospholipase A2, three-finger toxin (3FTX), aminopeptidase (APase), Cys-rich secreted protein (CRISP), 5'-nucleotidase (5Nase), hyaluronidase (Hyal), nerve growth factor (NGF), vascular endothelial growth factor (VEGF), L-amino acid oxidase (LAAO), and bradykinin-potentiating peptide and C-type natriuretic peptide (BPP and CNP) (Figure 3).
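The figures quoted above can be cross-checked with simple arithmetic. The following Python sketch is illustrative only and is not part of the original analysis; the variable names are ours, and computing depth as total sequence divided by the k-mer genome size estimate is an assumption about how the roughly 96-fold figure was obtained.

```python
# Sanity checks on the HabAm1 numbers quoted in the text (illustrative only).

macro_pairs = 8    # macro-chromosome pairs, including the ZW pair
micro_pairs = 10   # micro-chromosome pairs
diploid_number = 2 * (macro_pairs + micro_pairs)

shotgun_gb = 136.0      # total shotgun sequence, in Gb
genome_kmer_gb = 1.41   # genome size estimated by k-mer analysis, in Gb
genome_facs_gb = 1.8    # genome size estimated by FACS, in Gb

# Assumption: depth = total sequenced bases / estimated genome size.
depth_kmer = shotgun_gb / genome_kmer_gb
depth_facs = shotgun_gb / genome_facs_gb

sv_genes = 60    # toxic protein (SV) genes
nv_genes = 224   # non-venom (NV) paralogs
venom_related = sv_genes + nv_genes

print(f"2n = {diploid_number}")
print(f"depth against k-mer estimate: {depth_kmer:.1f}x")
print(f"depth against FACS estimate:  {depth_facs:.1f}x")
print(f"venom-related genes: {venom_related}")
```

Under this assumption, the quoted depth agrees with the k-mer estimate of genome size rather than the FACS estimate.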
Furthermore, the venom-related genes can be classified into three categories according to the degree of gene duplication (Figure 3). Category III consists of four gene families, MP, SP, CTLP, and PLA2, which are the major components of venom and are highly multiplied in both SV gene copies and NV paralogs. Category II includes 3FTX, APase, and CRISP, which show moderate multiplication in both SV gene copies and NV paralogs. Finally, category I, in which each family has only 1 SV copy and 2 to 10 copies of NV paralogs, contains the other venom-related genes such as LAAO, NGF, VEGF, Hyal, and 5Nase. Phylogenetic analyses of these venom-related genes revealed a unique evolutionary aspect of venom proteins: of the four copies generated by the two rounds of whole genome duplication (2R-WGD) that occurred early in vertebrate evolution, only one has gained venom function.
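As a compact summary of this classification, the categories can be written as a lookup table. This is a minimal sketch for illustration; it lists only the families named in the text, and the function name category_of is hypothetical.

```python
# Venom-related gene families grouped by degree of duplication,
# as described in the text (only the families named there are listed).
CATEGORIES = {
    "III": ["MP", "SP", "CTLP", "PLA2"],            # highly multiplied SV and NV copies
    "II": ["3FTX", "APase", "CRISP"],               # moderately multiplied
    "I": ["LAAO", "NGF", "VEGF", "Hyal", "5Nase"],  # 1 SV copy, 2 to 10 NV paralogs
}

# Invert the table so that each family name maps to its category.
FAMILY_TO_CATEGORY = {
    family: category
    for category, families in CATEGORIES.items()
    for family in families
}


def category_of(family: str) -> str:
    """Return the duplication category of a family, or 'unlisted' if not named."""
    return FAMILY_TO_CATEGORY.get(family, "unlisted")


for name in ("PLA2", "CRISP", "VEGF"):
    print(name, "-> category", category_of(name))
```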
Accelerated evolution of venom proteins was first found in the habu snake PLA2 genes [23,24] and was later found in other animal venom proteins and peptides such as conotoxins [25,26], scorpion toxins [27,28], and spider toxins [29]. Although accelerated evolution has also been demonstrated in genes involved in biodefense and reproduction [30], its mechanisms are unknown. Using the complete set of SV and NV gene families in the habu genome, we analyzed molecular evolutionary rates by computing the numbers of synonymous (KS) and non-synonymous (KA) nucleotide substitutions per site for each gene pair. The results suggested that accelerated evolution occurs only in categories III and II, for example in SP, PLA2, and CTLP (KA/KS ratios, mean ± SE: 1.047 ± 0.438 for svMPs, 1.253 ± 0.090 for svSPs, 0.871 ± 0.071 for svCTLPs, and 1.093 ± 0.062 for svPLA2s) [15]. On the other hand, the venom-related genes in category I and the NV paralogs in all categories I-III showed no accelerated evolution (KA/KS ratio: 0.512 ± 0.018).
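To make the comparison concrete, the reported mean KA/KS ratios can be set against the value reported for category I genes and NV paralogs. The Python sketch below is illustrative only: it uses just the means and standard errors quoted above, the names are ours, and the labels rely on the conventional reading that KA/KS above 1 suggests positive selection, which is a general rule of thumb rather than the test used in the original study.

```python
# Mean KA/KS ratios and standard errors as quoted in the text.
REPORTED = {
    "svMP": (1.047, 0.438),
    "svSP": (1.253, 0.090),
    "svCTLP": (0.871, 0.071),
    "svPLA2": (1.093, 0.062),
}
BASELINE_MEAN = 0.512  # category I genes and NV paralogs


def fold_over_baseline(mean: float, baseline: float = BASELINE_MEAN) -> float:
    """Ratio of a family's mean KA/KS to the non-accelerated baseline."""
    return mean / baseline


for family, (mean, se) in REPORTED.items():
    fold = fold_over_baseline(mean)
    # Rule of thumb (assumption): KA/KS > 1 is the classical sign of positive selection.
    label = "above 1" if mean > 1 else "below 1, but above baseline"
    print(f"{family:7s} {mean:.3f} +/- {se:.3f}  {fold:.2f}x baseline  ({label})")
```

All four families exceed the baseline mean; the svCTLP mean lies below 1, and the svMP estimate carries a comparatively large standard error.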
Furthermore, RNA-seq of 18 tissues of the habu snake (a total of 1.7 billion read pairs, 348 Gb of sequence, and 1.11 million transcripts identified) and a comprehensive transcript analysis of the venom gland using PacBio sequencing (~97,000 transcripts) were conducted [31]. Extensive alternative splicing was observed in three venom protein gene families, metalloproteinase (MP), serine protease (SP), and vascular endothelial growth factor (VEGF), with a total of 81, 65, and 8 transcript variants, respectively (Figure 3). In particular, over 80 svMP splice variants were transcribed from 11 genes that had diversified by gene duplication. MPs are key toxins that cause venom-induced pathologies such as hemorrhage, fibrinolysis, and apoptosis. According to their domain architecture, svMPs are classified into four groups, P-I to P-IV (Figure 4A). P-I type MPs possess only the metalloproteinase domain and are largely non-hemorrhagic. P-II type MPs contain MP and disintegrin domains. P-III type MPs contain a Cys-rich domain as well as MP and disintegrin domains. P-IV type MPs harbor lectin-like domains linked by disulfide bonds to a P-III-like structure. These different types of MP proteins can be produced from a single MP gene not only by proteolytic processing but also by alternative splicing, resulting in a wider variety of svMPs and disintegrin peptides (Figure 4B).
Thus, alternative splicing, in addition to accelerated evolution, is a mechanism for generating the diversity of venom proteins [15,31]. The abundance of different gene products within a gene family, caused by accelerated evolution and alternative splicing, may help expand the repertoire of effective weapons for prey capture, accompanied by neofunctionalization.
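The domain-based classification of svMPs can be expressed as a small lookup structure. The Python sketch below is illustrative only: the domain labels ("MP", "disintegrin", "cys_rich", "lectin_like") and the function classify_svmp are our own shorthand, and the per-gene average is a rough figure derived from the totals quoted above, not a reported result.

```python
# Domain composition of each svMP class as described in the text.
# The domain labels are shorthand introduced for this sketch.
SVMP_CLASSES = {
    "P-I": frozenset({"MP"}),
    "P-II": frozenset({"MP", "disintegrin"}),
    "P-III": frozenset({"MP", "disintegrin", "cys_rich"}),
    "P-IV": frozenset({"MP", "disintegrin", "cys_rich", "lectin_like"}),
}


def classify_svmp(domains) -> str:
    """Return the svMP class whose domain set matches exactly, else 'unclassified'."""
    wanted = frozenset(domains)
    for name, architecture in SVMP_CLASSES.items():
        if architecture == wanted:
            return name
    return "unclassified"


# Splice-variant totals quoted in the text for the three gene families.
variants = {"MP": 81, "SP": 65, "VEGF": 8}
total_variants = sum(variants.values())

# Rough average only: 81 MP variants spread over 11 MP genes.
mp_genes = 11
variants_per_mp_gene = variants["MP"] / mp_genes

print(classify_svmp(["MP", "disintegrin", "cys_rich"]))
print(f"total variants: {total_variants}")
print(f"MP variants per gene (average): {variants_per_mp_gene:.1f}")
```

A transcript that loses the exons for one domain through alternative splicing would map to a different class in this table, which is the sense in which a single gene can yield several svMP types.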

What comes from venomics projects
What have we learned from "venomics" research, including the decoding of whole genomes? It has partly revealed the mechanisms that produce diverse venom proteins, including accelerated evolution and alternative splicing, and how toxic organisms have evolved from nontoxic ones. In addition, "venomics" analyses of transcriptomes and proteomes across species reveal the relationship between the geographical distribution and the evolution of toxic organisms. Recent transcriptomic and proteomic analyses of several snake venoms have reconfirmed in detail that venom variation often occurs not only between species but also within species, among individuals that differ in geographic location, environment, and diet [32]. For example, a proteomic analysis of 18 species of the genus Micrurus in the Americas revealed that the composition of the major neurotoxins, PLA2 and 3FTX, varies dramatically from species to species [32]. Terciopelo (Bothrops asper) inhabiting Costa Rica has also been shown to differ in venom composition between populations from the Pacific coast and from the Caribbean coast. In addition, another species of the same genus, kaisaka (Bothrops atrox), which also inhabits Latin America, has been shown to differ in venom components between Colombia and Brazil [33,34]. These studies indicate that the composition and structure of venom vary from region to region even within the same species, and that anti-venom treatment for snakebites may not work in some areas because of this venom diversity. Snakebite envenoming is estimated to affect about 5 million people annually worldwide, of whom about 125,000 die and 400,000 suffer from sequelae such as the loss of extremities [35].
Although anti-venom is currently the only effective treatment for snakebites, in some cases anti-venom production has been discontinued for economic or political reasons. The World Health Organization (WHO) has drawn attention to this serious situation by describing snakebite as a "neglected tropical disease" [36]. Venomics research is important for developing anti-venoms, using protein engineering techniques, against previously unknown venom proteins identified by genome decoding, and for understanding the mechanisms of action of venom. Venomics research will also lead to the discovery of new tools for clarifying the complex mechanisms of life and of new functional molecules useful as pharmaceutical leads. For example, three-finger toxins, which are known as major neurotoxic components of Elapidae and Hydrophiidae venoms, were found in the habu snake genome [15].
Whole genome analysis is a powerful tool for understanding the molecular mechanisms involved in snake venom evolution. We expect that whole genome analyses of a wider variety of venomous species will accelerate the acquisition of comprehensive information about the different mixtures of venom proteins encoded by different sets of genes, and will advance the understanding of the evolutionary histories of venom systems and of the features common to venomous animals.