Molecular Toxinology – Cloning Toxin Genes for Addressing Functional Analysis and Disclosure Drug Leads

Introduction
The revolution in biology began with the genetic work of Gregor Mendel (1866), who, through his pea-breeding experiments, observed the phenomena of dominance and segregation of traits and discovered several laws of heredity. The pioneering endeavor of deciphering the link between the transmission of heredity and a biomolecule was carried forward by Griffith (1928) and by Oswald Avery, Colin MacLeod, and Maclyn McCarty (1944), who demonstrated that the instructions for virulence traits in bacteria are contained in the deoxyribonucleic acid (DNA) molecule, as well as by Alfred Hershey and Martha Chase (1952), who elegantly proved that bacteriophage progeny are produced after the injection of the bacteriophage's genetic material into the host cell. The hallmark of molecular biology arose in 1953 with the description of the double-helix structure of DNA by James Watson and Francis Crick, who also noted that such a structure suggests a mechanism for DNA replication (Watson and Crick, 1953). Notably, the deduction of the DNA structure relied on data from Erwin Chargaff, who determined the base correspondence and ratios in nucleic acids, and from Rosalind Franklin, who obtained X-ray diffraction images of DNA fibers. The elegant genetic experiments of François Jacob and Jacques Monod established the concept of cis-acting elements and the mechanism by which the operator and repressor regulate β-galactosidase expression and sugar metabolism in Escherichia coli (Jacob and Monod, 1961). At the same time, Marshall Nirenberg was leading the race to decipher the genetic code (Nirenberg, 2004). In the 1970s, Fred Sanger and Walter Gilbert developed two distinct methodologies for DNA sequencing, which culminated in automated high-throughput DNA sequence analysis, thus opening the door to the genomic revolution and the publication of hundreds of genomes, including the human genome.
The central dogma of molecular biology, which postulates that DNA directs its own replication and its transcription to yield RNA, which in turn directs its translation to form protein, was proposed by F. Crick in 1958 (Crick, 1970). It also allowed for the "probable transference of information", which we now know as RNA replication and reverse transcription, after the seminal works of D. Baltimore (1970) and J.M. Bishop (1973) on the molecular virology of retroviruses and oncogenes. The 'biological revolution' continued with important discoveries such as the mechanism by which chromosome ends (the telomeres) are protected against degradation and the involvement of the enzyme telomerase, which forms these ends, by Elizabeth H. Blackburn, Carol W. Greider, and Jack W. Szostak (Blackburn, 2005), as well as the novel mechanism of gene regulation mediated by double-stranded RNA (RNA interference), which triggers suppression of gene activity, by Andrew Z. Fire and Craig C. Mello (Mello and Conte, 2004; Fire, 2007). Most of these scientific conquests have been compiled by Lander and Weinberg (2002), who outline the initial events of the biological revolution in more detail. Other actors have played a significant role in the field, but a review of this short time frame of scientific conquests gives a taste of how the field of molecular biology experienced periods of excitement and 'epic' achievements, culminating in what we know today.
New avenues are being explored with the advent of two new disciplines that incorporate paradigmatic concepts and approaches to interrogate the complexity of life, namely, systems biology and synthetic biology. Systems biology focuses on the analysis of complex biological systems from a holistic point of view, i.e., the study of how individual genes and proteins interact to build entire organisms (unicellular or multicellular) and allow them to operate properly. Synthetic biology uses principles of engineering to build, from scratch, living systems able to perform alternative functions not found in nature (Bader, 2011).
Currently, recombinant DNA technology and DNA cloning represent an indispensable toolbox for research in life science fields as diverse as environmental science, evolutionary biology, cell biology, microbiology, molecular medicine and pharmacology, and structural and systems biology. A particular field of interest that involves recombinant DNA technology and covers the study of ecological, biochemical, pharmacological and structural aspects of animal toxins is referred to in the scientific community as molecular toxinology. From an ecological point of view, animal venom can be considered an arsenal of organic and protein substances capable of immobilizing a competitor or prey by interfering with specific molecular targets in its cells and tissues. Therefore, animals equipped with venom glands and an inoculating apparatus have a significant fitness advantage. Biochemically and pharmacologically, chemical and protein diversity correlates with biodiversity, i.e., the diverse indigenous animals inhabiting a given biome may offer novel compounds and bioactive molecules. Before the advent of molecular cloning and recombinant DNA technology, only the major components of venom could be purified, by protein chemistry techniques, in yields high enough for functional characterization. Classical tissue-based assays required milligrams of purified toxin, which limited the analysis of toxin biological activity at the molecular level. However, with the refinement of instrumentation in the context of 'omics' (e.g., genomics, transcriptomics, proteomics, and interactomics), toxins expressed in the venom even in minute amounts can be thoroughly examined. In this chapter, I introduce selected molecular biology techniques that can be applied to investigate the diversity of polypeptide molecules present in animal venoms.

Animal venom peptides and proteins as therapeutics
The global market for peptide and protein therapeutics was valued at over US$ 57 billion in 2006, and estimates suggest that it will grow continuously in the coming years at a 9.7% compound annual growth rate (CAGR) to reach US$ 103 billion by 2014 (Nair, 2011). According to these analyses, there is a high demand in the therapeutic protein market for engineered monoclonal antibodies (MAb), insulins, cytokines, interferons and related immune modulators, enzymes, hematopoietic growth factors (erythropoietins) and coagulation factors. These polypeptide drugs are designed for the treatment of autoimmune and cardiovascular diseases, diabetes and cancer. The increasing demand for therapeutic polypeptides is due in part to the excellent affinity and selectivity for the disease target displayed by such molecules, as well as to their biological compatibility. Currently, therapeutic peptides and proteins are manufactured by means of synthetic peptide chemistry and recombinant technology, respectively. Consequently, polypeptide drugs offer unparalleled opportunities for innovation in molecular design, improved pharmacokinetics, and targeted delivery of therapeutics.
Animal venom is a collection of molecules selected over millions of years of metazoan evolution, and specific tissue and target preferences are invariably observed among the numerous families of toxins. In fact, snakes (Birrel et al., 2007), scorpions (Bringans et al., 2008), spiders (Estrada et al., 2007), and sea snails (Becker et al., 2008), to name just a few, produce and secrete a valuable diversity of toxins capable of interacting with distinct molecular targets within the cells of their prey or victims. Importantly, from the point of view of molecular evolution and medical biotechnology, Fry (2005) inferred by phylogenetic analysis that the polypeptide toxins secreted in a given venom evolved from endogenous bioactive protein genes that were originally expressed in tissues other than venom glands. For example, snake venom three-finger toxins (3FTx) appear to have evolved from a common protein ancestor, such as the nicotinic acetylcholine receptor-binding LYNX, which is expressed in large projection neurons in the hippocampus, cortex and cerebellum. Other toxins related to ancestral proteins whose genes were preserved during the evolution of the snake venom gland include acetylcholinesterases, disintegrins/metalloproteinases, C-type lectins, complement C3, crotasin/defensin-like peptides, cystatins, endothelins, factor V, factor X, kallikrein, Kunitz-type protease inhibitors, LYNX/SLUR, L-amino acid oxidase, natriuretic peptides/bradykinin-potentiating peptides, nerve growth factor, phospholipase A2, and vascular endothelial growth factor. Considering the evolutionary conservation of polypeptide toxins and the excellent market opportunity for therapeutic peptides and proteins, animal toxins represent a unique source of ready-to-use engineered polypeptides capable of modulating vital human physiological and pathological processes.
Numerous examples of the use of animal toxins in medicine have been reported. In the articles "Bugs as Drugs" (parts I and II), E. P. Cherniack (2010; 2011) reviews the use of a number of different 'bugs' (worms, leeches, snails, ticks, centipedes, spiders) and their metabolic products (endogenous or secreted) in medicine and describes the clinical benefits of such biological and pharmacological resources. Because each biological effect of a venom reflects a specific activity intrinsic to an individual molecule, a myriad of new pharmacologically active compounds can be isolated from animal venom. For example, a dozen snake venom toxins belonging to several protein families, such as C-type lectins, metalloproteases, phospholipases A2, and three-finger toxins, display anticoagulant activities and a high potential for therapeutic use in preventing pathological clot formation (Kini, 2006). The snake venom components that act on the vertebrate blood coagulation cascade can be categorized, depending on their hemostatic action, as follows: enzymes that clot fibrinogen; enzymes that degrade fibrin(ogen); plasminogen activators; prothrombin activators; factor V activators; factor X activators; anticoagulants (inhibitors of prothrombinase complex formation, inhibitors of thrombin, phospholipases, and protein C activators); enzymes with hemorrhagic activity; enzymes that degrade plasma serine proteinase inhibitors; and platelet aggregation inducers (direct-acting enzymes, direct-acting non-enzymatic components, and inhibitors of platelet aggregation) (Markland, 1998). Some of these toxins have been used clinically as therapeutics or diagnostic reagents, while others are in preclinical trials (Fox and Serrano, 2007). Toxins from other animals have reached the clinic as well: exendin-4, a 39-amino acid peptide from the saliva of the lizard Heloderma suspectum, improves blood sugar control in adults with type 2 diabetes mellitus and has been commercially registered as Byetta® (exenatide). Furthermore, a number of toxins isolated from aquatic (including marine) and terrestrial animals represent a rich resource of novel molecules to treat highly prevalent cardiovascular diseases, such as high blood pressure and arrhythmias (Hodgson & Isbister, 2009), human neurological disturbances (Mortari et al., 2007), and cancer (Molinski et al., 2009). In fact, some of the best-selling drugs used to treat high blood pressure, namely captopril and its analogues, which act by inhibiting the angiotensin-converting enzyme (ACE), were developed by rational chemical synthesis using a pentapeptide toxin expressed in the venom of the Brazilian pit viper Bothrops jararaca as a model (Ferreira, 1965; 1985; Ondetti et al., 1977; Cushman and Ondetti, 1991). The therapeutic potential of venom peptides has also been investigated with regard to their pharmacological effects on ion channels and neural receptors (Lewis and Garcia, 2003). For example, an N-type voltage-sensitive calcium channel blocker peptide, isolated from the marine mollusk Conus magus, was recently approved by the US Food and Drug Administration as a drug (ziconotide/Prialt®) for the treatment of severe chronic pain (Schmidtko et al., 2010).
To study the mechanistic specificities of a given toxin toward a molecular target, milligrams of pure polypeptide are usually required, and such amounts are not always easy to obtain from certain venomous animals. This is particularly true for molecules purified from the venom and saliva of small creatures such as scorpions, spiders, wasps, venomous worms, and hematophagous animals. Consequently, a powerful approach to investigating the therapeutic potential of animal venoms is based on proteomics and molecular cloning techniques. Information about the proteomic-based characterization of animal venom (venomics) can be found elsewhere (Calvete et al., 2007; 2009; Escoubas et al., 2008; 2009). This chapter reviews selected topics in molecular toxinology, which include not only the cloning and recombinant expression of a single toxin but also the receptor-guided high-throughput screening of polypeptide venom libraries.

Molecular cloning of animal toxin genes
DNA cloning, or molecular cloning, is the process of constructing recombinant DNA molecules, transferring them into a given host cell and making copies of the inserted DNA, usually genes or the products of their transcription, i.e., messenger RNAs (Watson et al., 2008). All that is necessary for the propagation of cloned DNA is the piece of DNA of interest (the insert DNA) from a particular source, a vector (a small DNA molecule capable of self-replication and containing a selectable marker), and the restriction and modifying enzymes used to cut the insert and vector DNA and join them together. Once recombinant DNA molecules are prepared in vitro, a host cell, usually the Gram-negative bacterium Escherichia coli, is transformed with the engineered vector and propagated millions of times to produce large quantities of cloned DNA. When a collection of thousands of DNA sequences is cloned instead of a single piece of DNA, it is referred to as a library. Essentially, two main types of libraries have been prepared and utilized in toxin research: complementary or copied DNA (cDNA) libraries and genomic libraries.
The cDNA library is designed to represent the transcriptome, i.e., the pool of messenger RNA (mRNA) molecules produced by a cell type, tissue or organ at a given time or under a given metabolic condition. Therefore, it represents a snapshot of cell status, even if some transcripts (mRNAs) may be constitutively expressed. Thus, libraries of cDNAs or expressed sequence tags (ESTs) can be highly useful when a research project focuses on gene expression profiles or differential gene expression. Using cDNA and EST sequencing, the gene expression profiles of the venom glands from several species of venomous animals have been analyzed, including snakes (Zhang et al., 2006; Wagstaff and Harrison, 2006; Pahari et al., 2007; Cassewell et al., 2009; Neiva et al., 2009; Georgieva et al., 2010; Durban et al., 2011; Jiang et al., 2011; Rokyta et al., 2011), scorpions (Schwartz et al., 2007; Ma et al., 2009; Ruiming et al., 2010; Morgenstern et al., 2011), spiders (Chen et al., 2008; Fernandes-Pedrosa et al., 2008; Gremski et al., 2010; Jiang et al., 2010), platypus (Whittington et al., 2009; 2010), Conus (Pi et al., 2006; Hu et al., 2011), and jellyfish (Yang et al., 2003).
Differential gene expression in venom gland libraries has been used to investigate the molecular diversity of venom polypeptides. For example, Morgenstern and collaborators (2011) reported that the transcriptome of resting venom glands from the buthid scorpion Hottentotta judaicus differs significantly from that of glands actively engaged in regenerating venom. The transcriptome profile of a replete (resting) venom gland is rich in low-abundance toxin transcripts and tends to consist predominantly of open reading frame (ORF) sequences encoding toxins acting on voltage- and calcium-activated potassium ion channels. To perform this study, a cDNA library was prepared from the venom glands of scorpions that had not been milked or induced to produce venom and was then compared with a cDNA library prepared from the venom glands of milked specimens, in which the glands were committed to replenishing the venom pool of transcripts and toxins. This technique, termed cDNA subtraction, which will be described in the next section, and the resulting subtracted cDNA library are generally used to screen for cDNAs corresponding to differentially expressed or regulated mRNAs. The differential pattern of transcripts in resting and regenerating venom glands reveals an important aspect to be considered when studying the transcriptomes of venomous animals and indicates which strategy should be adopted to prepare cDNA libraries. In practical terms, most cDNA libraries are prepared after milking the venom from venomous animals to empty the venom gland and induce the synthesis of RNA prior to mRNA purification (Rottenberg et al., 1971; Rádis-Baptista, 1999). Thus, by examining full or subtracted cDNA libraries, as exemplified for scorpion venom glands, qualitative and quantitative differences can be detected, particularly when one wishes to know the constitutive or induced venom transcriptomes for comparison with the corresponding proteomes (Ma et al., 2010). In fact, a combination of transcriptome analysis, i.e., cDNA sequencing, with mass spectrometry represents a useful alternative for characterizing animal venom, as was done for marine cone snail venom, and for comparing the inter- and intra-species variation that exists among venom peptide libraries (Gowd et al., 2008).
In molecular toxinology, the data obtained from genomic DNA library analysis seem, at first glance, less informative than those obtained from cDNA libraries owing to the relatively static nature of genomes. However, genomic libraries provide information about gene number, diversity and organization. For example, using a PCR homology screening method, a paralogous gene of the toxin crotamine, crotasin, which is 2.5 kilobases (kb) long and organized in three exons separated by two introns, was identified in a rattlesnake genomic DNA bacteriophage library; this suggested that gene duplication and accelerated independent evolution operated in the diversification of the crotamine/crotasin genes (Rádis-Baptista et al., 2004). Such phenomena of gene evolution are a recurrent theme in toxinology (Kordis and Gubensek, 2000; Fry et al., 2009), and genomic libraries are a good technical resource for retrieving such information. Genomic DNA libraries constructed in bacterial artificial chromosomes (BACs) for Bungarus multicinctus and Naja atra, two Old World venomous elapid snakes, were screened with probes for four major families of toxins, namely three-finger toxin (3FTx), phospholipase A2 (PLA2), Kunitz-type protease inhibitor (Kunitz) and natriuretic peptide (NP), and the results showed 3FTx to be the major toxin gene family in elapid venom (Jiang et al., 2011). These 3FTx genes are also composed of three exons and two introns in a region of approximately 2.5 kb (Tamiya and Fujimi, 2006), and they were shown to represent five putative tandem duplicates in B. multicinctus and seven in N. atra, suggesting that tandem duplications have also contributed to the expansion of toxin multigene families in these two elapids (Jiang et al., 2011). Interestingly, genome analysis of the platypus (Ornithorhynchus anatinus) coupled with the transcriptome of its venomous apparatus revealed 83 novel putative platypus venom genes, distributed among 13 toxin families, which are homologous to known toxins from a wide range of vertebrates (fish, reptiles, insectivores) and invertebrates (spiders, sea anemones, starfish). A number of these platypus venom toxin families are expressed in tissues other than the venom gland, as observed earlier with snake venom toxins (Whittington et al., 2010).
A glimpse into the realm of molecular toxinology reveals that recombinant DNA technology is essential for the analysis of the complex pharmacological effects of venom toxins and their potential biomedical and clinical applications. In the next section, the current and potential molecular techniques used for investigating toxins in a high-throughput manner are described in further detail.

Construction of a venom gland cDNA library
The basic steps for constructing a cDNA library include the following: (1) excision of venom glands from the venomous animal of interest, (2) preparation of total and messenger RNA for cDNA cloning, (3) synthesis of cDNA from mRNA by reverse transcription, (4) selection of a vector, plasmid or bacteriophage (phage), for cloning and propagation of the cDNA library, (5) sequencing of the entire cDNA library or screening for desired clones before sequence analysis, and (6) validation of cDNA clones for functional analysis. Synthesis kits for the convenient construction of cDNA libraries are commercially available, and detailed information on a particular system can be obtained from sales representatives at companies such as Clontech Laboratories, Inc. (Mountain View, CA, U.S.A.) and Stratagene (presently Agilent Technologies, Inc., Santa Clara, CA, U.S.A.).
Total RNA is typically purified from the tissue of choice, in which the sequences of interest are most abundant, using the single-step acid guanidinium thiocyanate-phenol-chloroform method of Chomczynski and Sacchi (1987), by which cells are quickly disrupted, their components are solubilized, and endogenous RNases are simultaneously denatured by the guanidinium salt. Although such methods of RNA extraction and purification are very effective and reliable, caution should be taken when the RNA is extracted from tissues rich in lipids, polysaccharides and proteoglycans. In such cases, modifications of the method, such as introducing an additional organic extraction step and changing the conditions of RNA precipitation, can counteract the interference of these contaminants, which would otherwise inhibit reverse transcription. Total RNA purified by this single-step method is used not only for the purification of mRNA, or poly(A)+ RNA, and for cDNA synthesis but also for applications such as northern hybridization, RNase protection assays and dot/slot blotting (Sambrook and Russel, 2001). In general, tissues such as snake venom glands and sea anemone tentacles yield between 3 and 10 µg of RNA per mg of tissue.
Once the total RNA has been purified, its quality and integrity must be analyzed. The quality is assessed spectrophotometrically from the A260/A280 ratio, which should be between 1.8 and 2.0. RNA integrity is confirmed by running an aliquot of the total RNA preparation on a denaturing formaldehyde agarose gel (stained with ethidium bromide) to observe the ratio of 28S to 18S ribosomal RNA. The theoretical 28S:18S ratio for eukaryotic RNA is approximately 2:1, but this ratio may differ for RNA extracted from the tissues of other organisms. An alternative to denaturing formaldehyde agarose gel electrophoresis is the analysis of total RNA in a microfluidic device, by which RNA integrity and concentration are automatically assessed. However, because it is more convenient, less laborious and inexpensive, gel-based analysis of total RNA quality can still be useful when the RNA samples are denatured with formamide and run on a TAE agarose gel, instead of using formaldehyde as the denaturing agent (Masek et al., 2004).
Traditionally, cDNA libraries are prepared using mRNA, and not total RNA, as the template for reverse transcription. Thus, methods of preparing high-quality mRNA should be applied. These methods are based on the presence of a 3'-end polyadenosine tail in most eukaryotic mRNAs. Short oligonucleotides of deoxythymidine, 18 to 30 nucleotides in length (oligo(dT)18-30), immobilized on cellulose or linked to biotin, form a stable hybrid with the poly(A)+ tail of mRNA in the presence of a high concentration of salt. The polyadenylated RNAs are denatured (at 70-72 °C for 5-10 min), allowed to hybridize with the oligo(dT) and separated by affinity (column chromatography) or captured (with streptavidin-coated paramagnetic beads) from the other RNAs (ribosomal RNA, transfer RNA), which are washed away; the poly(A)+ RNA is then eluted with a low-salt buffer. The poly(A)+ RNA pool is
quantified by ultraviolet (UV) spectrophotometry and then used for cDNA synthesis. It is important to note that poly(A)+ RNA constitutes only 1 to 5% of total RNA, and mRNAs that represent less than 0.5% of the total mRNA population of the cell are referred to as 'rare' mRNAs. Therefore, when a cDNA library is being prepared, it should be comprehensive enough to include clones that represent 'low abundance' or 'rare' transcribed sequences (mRNAs). The required library size can be estimated using the following formulas:
$$N = \frac{\ln(1 - P)}{\ln(1 - 1/n)}$$
where N is the number of clones required, P is the probability (usually 0.99), and 1/n is the fraction of the total mRNA that is represented by a single type of rare mRNA (Sambrook and Russel, 2001), and
$$B = \frac{\ln(1 - 0.99)}{\ln(1 - n/T)}$$
where n is the number of molecules of the rarest mRNA in a cell and T is the total number of mRNA molecules in a cell. The desired base (B) is the number of clones that should be screened to achieve a 99% probability that a cDNA clone will exist in the library (Ausubel et al., 1998).
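As a rough illustration of the library-size estimate discussed above, the following Python sketch evaluates the commonly used form N = ln(1 − P) / ln(1 − f), where f is the fraction of the mRNA pool contributed by the transcript of interest (1/n in one notation, n/T in the other). The function names and the example abundances are illustrative assumptions, not values taken from the cited manuals.

```python
import math


def clones_required(fraction: float, probability: float = 0.99) -> int:
    """Clones needed so that a transcript making up `fraction` of the mRNA
    pool is represented at least once with the given `probability`.

    Implements N = ln(1 - P) / ln(1 - f), rounded up to a whole clone.
    """
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))


def clones_required_from_counts(rare_copies: int, total_mrna: int,
                                probability: float = 0.99) -> int:
    """Same estimate, with the fraction given as copies of the rare mRNA
    (n) over the total number of mRNA molecules per cell (T)."""
    return clones_required(rare_copies / total_mrna, probability)


if __name__ == "__main__":
    # Hypothetical example: a transcript present at 1 part in 100,000.
    print(clones_required(1e-5))
    # Hypothetical example: 5 copies among 500,000 mRNA molecules per cell.
    print(clones_required_from_counts(5, 500_000))
```

For a transcript present at one part in 100,000, the estimate is roughly 4.6 × 10^5 clones at P = 0.99, which is of the same order as the 500,000 to 1,000,000 independent clones usually quoted for a representative library.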
Often, when 500,000 to 1,000,000 independent cDNA clones are present in an unamplified library, at least one copy of every mRNA should be present in the library. In more practical terms, to obtain a representative cDNA library with a high probability of finding a clone harboring a rare transcribed sequence, 1 to 5 µg (or less) of poly(A)+ mRNA is usually sufficient.
To synthesize cDNAs, commercially available RNA-dependent DNA polymerases (reverse transcriptases, RTs), derived from avian or murine retroviruses, catalyze the addition of deoxyribonucleotides to the 3'-hydroxyl terminus of a primed RNA-DNA hybrid, in which a deoxyoligonucleotide, such as an oligo(dT), a pool of random hexamers or a specific sequence, is used as the primer and the mRNA is used as the template. This extension is referred to as the reverse transcriptase 5'→3' DNA polymerase activity, and the product is a hybrid molecule composed of a single-stranded RNA (the mRNA) and a single-stranded cDNA (ss-cDNA). Non-engineered RTs have two additional catalytic activities and are consequently multifunctional enzymes. These RTs display a low level of DNA-dependent DNA polymerase activity but a considerable processive 3'→5' or 5'→3' exonuclease activity that degrades the RNA in an RNA:DNA hybrid, known as RNase H activity (Ausubel et al., 1998). Although the RNase H activity of RTs is useful in molecular biology (e.g., for the selective destruction of parts of an RNA molecule), it has been eliminated from most recombinant engineered enzymes commercialized for research to avoid degradation of the mRNA template and improve the yield of synthetic cDNA. It is important to understand the mechanism by which RT functions, given the final product of this step: ss-cDNA is a useful starting material for other specific cloning techniques, such as 3'- and 5'-RACE (rapid amplification of cDNA ends), reverse transcription coupled with PCR (RT-PCR), and quantitative real-time PCR (qPCR).
One critical step in the preparation of double-stranded cDNA (ds-cDNA) for cloning is the choice of primers for the synthesis of first-strand cDNA. Primers used for cDNA synthesis include (1) oligo(dT), a 12-18 nucleotide oligomer that binds to the poly(A) tail at the 3' end of mRNA; (2) primer-adaptors that contain a homopolymeric tract at the 3' end and a restriction site; (3) primers linked to a plasmid; and (4) random primers (Sambrook and Russel, 2001). However, the most popular methods for cDNA library construction are based on priming cDNAs with an oligo(dT) adaptor for directional cloning and on homopolymeric priming of second-strand cDNA; the latter technique is commercially known as "switching mechanism at 5' end of RNA template" (SMART). Once first-strand cDNA synthesis is accomplished (typically for 60 min at 37-42 °C), other modifying enzymes are employed, such as RNase H for introducing nicks into the RNA strand of the mRNA:cDNA hybrids; E. coli DNA polymerase I for extending the 3' ends of the RNA primers (generated by RNase H activity on the mRNA moiety) and replacing the fragments of mRNA in the mRNA:cDNA hybrids; bacteriophage T4 DNA polymerase for polishing the ds-cDNA; and T4 polynucleotide kinase for phosphorylating the 5'-hydroxyl groups on the ends of the ds-cDNA for the ligation of linkers or adaptors, which is accomplished by T4 DNA ligase. Methylases (e.g., EcoRI methylase), which methylate internal cleavage sites in the ds-cDNA, are generally used to protect the synthesized cDNA from the restriction enzymes employed for linker-adaptor digestion in the last steps of preparing cDNAs for cloning. Note that phenol-chloroform extraction and ethanol precipitation steps are necessary for cleaning up the ds-cDNA by removing enzymes, buffers and reaction components (Ausubel et al., 1998; Sambrook and Russel, 2001).
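The quantities quoted in this section (an A260/A280 ratio between 1.8 and 2.0, a yield of 3 to 10 µg of total RNA per mg of tissue, a poly(A)+ fraction of 1 to 5%, and a requirement of 1 to 5 µg of poly(A)+ RNA) can be combined into a back-of-the-envelope planning aid. The Python sketch below is only that: the function names are hypothetical, and it assumes complete recovery during poly(A)+ selection, which real purifications will not achieve.

```python
def a260_a280_ok(a260: float, a280: float,
                 low: float = 1.8, high: float = 2.0) -> bool:
    """Return True if the A260/A280 ratio falls within the accepted range."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return low <= a260 / a280 <= high


def tissue_needed_mg(target_polya_ug: float,
                     total_rna_ug_per_mg: float,
                     polya_fraction: float) -> float:
    """Tissue mass (mg) needed to reach a target amount of poly(A)+ RNA.

    Assumes 100% recovery at every step (an optimistic simplification).
    """
    total_rna_needed_ug = target_polya_ug / polya_fraction
    return total_rna_needed_ug / total_rna_ug_per_mg


def tissue_range_mg(target_polya_ug: float,
                    yield_range=(3.0, 10.0),
                    polya_range=(0.01, 0.05)) -> tuple:
    """Best-case and worst-case tissue mass (mg) over the quoted ranges."""
    best = tissue_needed_mg(target_polya_ug, max(yield_range), max(polya_range))
    worst = tissue_needed_mg(target_polya_ug, min(yield_range), min(polya_range))
    return best, worst


if __name__ == "__main__":
    print(a260_a280_ok(0.95, 0.50))   # ratio of 1.9
    best, worst = tissue_range_mg(1.0)
    print(f"1 ug poly(A)+ RNA: {best:.1f} to {worst:.1f} mg of tissue")
```

Under these assumptions, obtaining 1 µg of poly(A)+ RNA would take somewhere between about 2 mg and about 33 mg of gland tissue, which illustrates why glands from several small specimens often need to be pooled.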
Another important step when preparing a cDNA library is the size fractionation of cDNA molecules. Size exclusion chromatography (e.g., on Sepharose CL-4B) is used to fractionate cDNAs and eliminates linkers or adaptors, enzymes, and buffer components, which would be detrimental in the subsequent cloning steps. Moreover, with this procedure, truncated cDNAs that arise from incomplete first- or second-strand cDNA synthesis are discarded, and sequences >400-500 nucleotides are conveniently selected for construction of 'high molecular weight' libraries. Researchers dealing with venom gland cDNA libraries should note that many toxins (e.g., cardiotoxin, crotamine-like toxins and sarafotoxin) are encoded by short nucleotide sequences; thus, it is advisable to also prepare a low molecular weight sub-library from venom gland cDNAs. As a rule of thumb, size fractionation is performed after cDNA synthesis has been completed and just before the ds-cDNA is cloned into a vector. Figure 1 displays a typical agarose gel electrophoresis analysis of size-fractionated ds-cDNA from an animal venom gland. Both popular methods of cDNA synthesis for library construction, priming mRNA with an oligo(dT) adaptor and homopolymeric priming of second-strand cDNA, produce cDNAs for directional cloning. Thus, the vectors and strategy for cDNA cloning are chosen based on the approach used for ds-cDNA synthesis. With these two methods, the adaptors and linkers at the 5' and 3' ends of the cDNA molecules, which contain restriction sites, are cleaved with two different enzymes, thereby producing cDNAs with distinct cohesive (sticky) ends. Vectors that are used for cloning cDNA include bacteriophages (phages) and plasmids. Today, either vector is appropriate for the preparation of libraries in the range of 10^6 to 10^7 independent recombinant clones when high-quality phage packaging extract and high-efficiency electrocompetent E. 
coli are used for the construction of phage and plasmid cDNA libraries. Examples of vectors currently used are the phage-engineered plasmids (phagemids) λZAP, λZAPII and λZAP Express (Stratagene/Agilent Technologies, Inc., Santa Clara, CA, U.S.A.) and the plasmids pcDNA3.1, pDNR and pSMARTer (Clontech Laboratories, Inc., Mountain View, CA, U.S.A.). The phagemid vectors are derived from the genomes of highly infectious bacteriophages, from which a high number of recombinant clones can be obtained, and allow for the in vivo recovery of plasmids harboring the inserted cDNAs. These vectors are commercially available either linearized or as phage arms, and the linear plasmid can be prepared in advance with the appropriate restriction enzymes, although the process of vector preparation is relatively laborious. Rádis-Baptista and collaborators (1999; Kassab et al., 2004; Neiva et al., 2009) have successfully prepared venom gland cDNA libraries in both λ phage and plasmids with a high number of independent recombinant clones and satisfactory average insert sizes. Libraries prepared in these ways are then amplified, and the clones are pooled or stored in microwell plates and maintained at -80°C for screening. Several molecular biology techniques are typically employed to screen for a specific cDNA sequence of interest. In molecular toxinology, a convenient and practical protocol involves the 'PCR homology screening' technique (Israel, 1993; 1995; Radis-Baptista, 1999). Given that families of toxins have divergent members with conserved motifs, the nucleotide or amino acid sequences of known toxin representatives are multi-aligned, the consensus segment is selected, and sequence-specific oligonucleotide primers are synthesized. Hundreds of small aliquots from the cDNA library are separately propagated in 96-well plates and analyzed by conventional PCR using a combination of gene-specific and vector primers. The pools of clones that are scored as positive are subdivided 
into a reduced number of clones per well, propagated, and re-screened by PCR. This iterative procedure is repeated until all clones in a single well score positive for the gene of interest. PCR homology screening not only facilitates the identification of clones of interest but also generates amplicons that can be easily and directly cloned for sequencing analysis. Moreover, PCR homology screening is less laborious and tedious than screening cDNA libraries with labeled oligonucleotide probes, which requires replica plates, support membranes, and radioactive or fluorescent probes. As most vectors incorporate elements for functional analysis of cloned inserts, cDNAs from a library can also be screened by detecting a biological activity that results from recombinant protein expression. However, this approach is not commonly used for screening toxin cDNA clones because of the low efficiency of toxin refolding in vitro, which is due to the high number of disulfide bonds that toxin molecules contain. Phenotype coupled to genotype-based screening of cDNA libraries is best achieved by alternative molecular techniques of protein-protein interaction, as described in the next sections. The main steps involved in the construction of cDNA and genomic DNA libraries are summarized in Figure 2.
Fig. 2. Overview of main steps for preparing venom gland cDNA and genomic DNA libraries
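The iterative pool subdivision used in PCR homology screening can be sketched in a few lines of Python. This is only an illustrative simulation under stated assumptions: clones are represented by hypothetical integer identifiers, the PCR assay is replaced by a simple membership test (`pcr_positive`), and the function names and the 96-well split are choices made for the example, not part of any published protocol.

```python
import random


def pcr_positive(pool, target_clones):
    """Stand-in for a PCR assay: a pool scores positive if it holds any target clone."""
    return any(clone in target_clones for clone in pool)


def split_pool(pool, n_wells):
    """Distribute the clones of one pool over up to n_wells non-empty wells."""
    wells = [pool[i::n_wells] for i in range(n_wells)]
    return [well for well in wells if well]


def screen(library, target_clones, n_wells=96):
    """Subdivide positive pools until every clone in each positive well is a target.

    Returns the set of recovered clones and the number of subdivision rounds.
    """
    pools = [list(library)]
    rounds = 0
    while True:
        positives = [p for p in pools if pcr_positive(p, target_clones)]
        if not positives:
            return set(), rounds
        # Stop when all positive wells are pure (100% of clones score positive).
        if all(all(c in target_clones for c in p) for p in positives):
            return {c for p in positives for c in p}, rounds
        pools = [well for p in positives for well in split_pool(p, n_wells)]
        rounds += 1


# Hypothetical library of 10,000 clones containing three toxin-positive clones.
library = list(range(10_000))
targets = set(random.Random(0).sample(library, 3))
found, rounds = screen(library, targets)
print(found == targets, rounds)
```

In the simulation, purity of a well is checked directly against the known targets; at the bench, the equivalent check would be a PCR on individual colonies from the final wells.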

Subtractive and genomic libraries in animal toxin research
The subtracted cDNA library is very convenient for comparative studies of transcript expression. For example, the production of a subtracted cDNA library is applicable (and recommended) when comparing the pattern of expression in venom glands of two distinct poisonous animals from specific geographical localities or the expression profiles of resting (replete) and regenerating venom glands. The principle of subtractive cloning relies on nucleic acid hybridization, by which nucleotide sequences differentially expressed in one cell or tissue type (the tracer) are hybridized to a complementary nucleic acid pool isolated from a cell or tissue that is not expected to express the sequence of interest (the driver) (Sagerström et al., 1997; Ausubel et al., 1998). An excess of driver, at least 10-fold higher than the tracer, is prepared from the cell or tissue lacking the sequences of interest. The driver (also designated [-]) and tracer (bait or [+]) nucleic acid pools are hybridized, and only nucleotide sequences common to driver and tracer form hybrids. Driver-tracer hybrids are removed, and unhybridized sequences are utilized for successive rounds of subtraction for maximal removal of sequences that are common to both tracer and driver. The enriched unhybridized sequences are prepared for cloning, and the tracer-specific clones subsequently constitute the subtracted library. There are at least four strategies for subtractive cloning, in which both DNA and RNA can be used as the driver and tracer. However, each subtractive cloning strategy has associated advantages and drawbacks, which should be considered when designing the experiment. For example, it is important to consider the source and amount of material that will serve as the driver and tracer sequences, whether an amplification step is necessary, and the kind of molecules that will serve as tracer (e.g., first-strand cDNA, ss-DNA or ds-DNA) and as driver (poly(A)+ RNA, RNA, ss-DNA or ds-DNA).
Taking these points into consideration, the strategy of subtraction can be chosen from the usual schemes, categorized as the basic technique, the PCR-based technique, library-library hybridization and positive selection. With the basic subtractive cloning technique, cDNA serves as the tracer and mRNA as the driver. A large amount of tissue is required for this procedure, and the subtractive hybridization can be performed only twice. With the PCR-based strategy, a small amount of starting material (ss-cDNA or ds-cDNA) can be used, and multiple rounds of subtraction are feasible. With library-library subtraction, collections of ss-cDNAs are used, and full-length sequences are obtained; however, the procedure is not easy to repeat. In positive subtractive selection, also known as the cohesive restriction sites method, an excess of [-] cDNA sequences is digested with restriction enzymes to produce blunt-ended fragments and then mixed with sonicated [+] cDNAs. After hybridization, only clonable cDNAs represent the tracer-specific sequences (Sagerström et al., 1997). The literature describing the application of subtractive cloning for the investigation of differentially expressed genes is plentiful (for example, Lockyer et al., 2008; Zhang et al., 2008; Lefèvre and Murphy, 2009; Li et al., 2010; Matsumoto et al., 2011; Chengxiang et al., 2011; Liu et al., 2011; Padmanabhan et al., 2011), while only a few examples have been reported in the field of molecular toxinology (Baek et al., 2009; Baek and Lee, 2010; Morgenstern et al., 2011). As mentioned before, another type of library typically used for studies of toxin genes is the genomic DNA library (Figure 2). As the first step in library construction, tissue is surgically excised from anesthetized animals (e.g., liver from venomous animals), or cells (e.g., blood) are collected from the organism of interest without killing it, and high molecular weight genomic DNA is then purified. Genomic DNA inserts are then prepared by 
controlled digestion with rare-cutting restriction enzymes, producing blunt-ended DNA fragments, which are then ligated to a selected vector. Vectors for cloning genomic DNA should have a high capacity to accommodate large fragments (in the range of 20 to 40 kb or more). These high-capacity vectors include, listed in order of capacity, cosmids (30-45 kb), bacteriophage P1 (70-100 kb), bacterial artificial chromosomes (BACs; 120-300 kb) and yeast artificial chromosomes (YACs; 0.25 to 1.2 Mb) (Ausubel et al., 1998; Sambrook and Russel, 2001). Recombinant YACs are produced by digesting the vector with selected restriction enzymes and ligating the restriction-digested (and size-selected) genomic DNA into the left and right arms of the YAC vector. The YAC libraries are then transformed into yeast (Saccharomyces cerevisiae) and utilized for the genetic screening of clones of interest. The YAC vectors contain restriction sites, a centromere (CEN4), an autonomously replicating sequence (ARS1), two selective markers, and telomeric sequences (TEL) at the vector termini. Such features allow recombinant YACs to be autonomously replicated, segregate between daughter yeast cells and, due to their stability, integrate into the host chromosome. Bacterial artificial chromosomes are circular, double-stranded DNA, low copy number vectors derived from the E. coli fertility (F) factor, a large plasmid that carries genes encoding proteins involved in replication, partition and conjugation. Genetically engineered BACs are commercially available, and manipulation of this vector for preparing genomic libraries is straightforward in comparison to YAC libraries. In such cases, restriction-digested genomic DNA fragments, selected by size, are ligated into linearized BAC vectors and then transformed into E. 
coli by electroporation. BAC libraries are then stored frozen until arrayed for clone screening by hybridization with oligonucleotide probes or by PCR with specific primers. Cosmids are conventional plasmids that have been engineered to allow for cloning of large pieces of DNA. Cosmid vectors contain one or two selectable markers (e.g., neo^r and amp^r), a plasmid origin of replication (ori), restriction cloning sites, and one or two cohesive end sites (cos) from phage λ, which are essential for packaging of the recombinant genome into infective phage particles. Restriction-digested genomic DNA and the vector are ligated, and the resultant linear concatenated recombinant molecules are packaged in vitro into phage heads. The infectious recombinant λ phage injects the cosmid DNA into susceptible E. coli host cells, where the host's ligase covalently joins the complementary sticky ends of the cosmid vector, thereby producing circular molecules that replicate as plasmids. Methods of screening genomic DNA cosmid libraries include hybridization of colonies plated on membranes (replica filters) and PCR-based techniques. As cosmids are λ phage-derived vectors that possess high cloning efficiency and a high capacity for incorporating relatively large DNA fragments, they represent excellent vectors for the construction of complex genomic DNA libraries from venomous organisms (e.g., Nobuhisa et al., 1997; Beye et al., 1998; Fujimi et al., 2003; Rádis-Baptista et al., 2004). These libraries have been used for the identification of toxin genes and have led to reports describing their organization, to a better understanding of the molecular evolution and diversification of toxin genes, and to studies of comparative genomics and basic genetics. Genomic DNA libraries of venomous animals also constitute real archives of a particular biological resource, from which genomic information can be retrieved at any time regardless of how difficult it is to find and capture the organism in nature. The first step for constructing a 
genomic DNA library involves the preparation of high molecular weight (HMW) genomic DNA. In the basic protocol, the tissues of interest are quickly frozen in liquid nitrogen and pulverized. The tissue powder is then transferred to a solution containing proteinase K and sodium dodecyl sulfate (SDS) and incubated until the cellular protein is completely degraded and the nucleic acid is released. The digest is extracted with phenol/chloroform/isoamyl alcohol, and the HMW genomic DNA is precipitated with ethanol, fished out (to avoid shearing forces and DNA rupture), dried, and finally resuspended in aqueous buffer. Genomic DNA prepared in this manner is of sufficient size (in the range of 100-150 kb) for cloning into phage and cosmids when precautions to minimize DNA degradation are taken (Ausubel et al., 1998; Sambrook and Russel, 2001). After purification, the HMW genomic DNA is partially digested with rare-cutting restriction enzymes (e.g., SalI and SpeI) to generate cosmid inserts, which are dephosphorylated with calf intestinal phosphatase (CIP) and fractionated by size using agarose gel electrophoresis. Standard agarose gels are convenient when separating genomic DNA fragments in the range of 0.5 to 25 kb, whereas pulsed-field agarose gels are required to resolve molecules ranging from 10 kb to 2 Mb. Partially digested, dephosphorylated genomic DNA is ligated to cosmid arm DNA using bacteriophage T4 DNA ligase, and the ligated genomic DNA-cosmid vector is then incubated with packaging extract, which contains all proteins required to produce infectious λ phage. E. 
coli host cells are transduced with the packaged cosmids and then spread onto selective agar plates. The number of λ phage plaques is counted, the efficiency of cloning is calculated, and the average size of the inserts is analyzed by restriction endonuclease digestion and agarose gel electrophoresis. In general, the efficiency of cloning is on the order of 10^5 to 10^7 plaque forming units (pfu)/µg of genomic DNA. The genomic DNA cosmid library is titered by plaque dilution assay and amplified for storage. Cosmid libraries are stored in aliquots and preserved at -80°C in either 7% dimethyl sulfoxide or 15-20% sterile glycerol. Additional information regarding the cloning of large genomic DNA fragments for the preparation of libraries can be obtained from dedicated companies such as Agilent Technologies, Inc. (Santa Clara, CA, U.S.A.). In addition to the construction of genomic DNA libraries, there are other, simpler techniques for cloning toxin genes and examining their structural organization, such as direct cloning by PCR amplification with specific primers designed from cDNA sequences (Jiqun et al., 2004; Zhijian et al., 2006; Mao et al., 2007) and 'genome (DNA) walking' (Gendeh et al., 1997; Legros et al., 1997; Afifiyan et al., 1999; Jeyaseelan et al., 2003).
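When planning a genomic library, it can help to estimate how many independent clones are needed for a given sequence to be represented. A common estimate, not given in the text above, is the Clarke-Carbon relation N = ln(1 - P) / ln(1 - f), where P is the desired probability and f is the insert size divided by the genome size. The sketch below applies it; the 40 kb insert and the 2 Gb genome are hypothetical values chosen only for illustration.

```python
import math


def clones_needed(insert_kb, genome_kb, probability=0.99):
    """Clarke-Carbon estimate of the number of independent clones required.

    N = ln(1 - P) / ln(1 - f), with f = insert size / genome size.
    """
    f = insert_kb / genome_kb
    return math.ceil(math.log(1.0 - probability) / math.log1p(-f))


# Hypothetical example: 40 kb cosmid inserts, 2 Gb (2,000,000 kb) genome.
n = clones_needed(insert_kb=40, genome_kb=2_000_000, probability=0.99)
print(f"about {n:.2e} independent clones for 99% probability")
```

Under these assumed values the estimate is on the order of 10^5 clones, which is within reach of the cloning efficiencies of 10^5 to 10^7 pfu/µg quoted above.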

Molecular techniques of protein interaction for target discovery of animal toxins
Polypeptide toxins initially act on target cells and tissues by interacting with a particular biomolecule (e.g., membrane lipids, proteoglycans, ion channels, glycoprotein and integrin receptors). Protein-protein interaction techniques designed to assess the association of proteins in mammalian cells can be used to analyze the mechanism by which toxins act to intoxicate organisms at the molecular (protein) level and to identify ligands with high specificity and selectivity for a given partner (receptor). The techniques used for the identification of protein-protein interactions include the fusion of a protein of interest with glutathione-S-transferase (GST fusions) and analysis by far-western or pull-down (Einarson & Orlinick, 2002), co-immunoprecipitation and mass spectrometry-based protein characterization (Adams et al., 2002), yeast and bacterial two-hybrid selection systems (Serebriiski & Joung, 2002), and phage display (Goodyear & Silverman, 2002). These represent robust and promising techniques for the identification of therapeutic polypeptides due to the direct linkage between the molecular cloning of genes (genotype) and analysis of the biological activity of expressed peptides and proteins (phenotype). As described by E. 
Golemis (2002), using such techniques, three types of information may be obtained: (1) the identification of every possible set of interacting proteins for a target (protein of interest); (2) the physiological significance of such interactions once partner proteins have been identified; and (3) the validation of the physiological role of interacting proteins with the systematic use of modulators. These protein-protein interaction techniques enhance the target-driven identification of novel animal polypeptide toxins in the context of clinical application. With these techniques, instead of cloning a single DNA sequence into a vector to produce the bait chimera, a library of thousands of cloned DNAs is scrutinized through rounds of target-driven (bait-specific) binding. Consequently, methods used to examine protein-protein interactions are extremely promising for the screening of animal toxins in a customized approach, i.e., based on target-driven selection.

Phage display and its potential to interrogate animal toxin libraries
In phage display, or peptide display, the polypeptide library of interest is fused to a capsid protein and expressed on the surface of a bacteriophage (or phage), which becomes 'decorated' with the recombinant coat protein and is therefore available for analysis of receptor and ligand binding. The "phage display" technology is a robust strategy used to identify and investigate highly specific protein-protein interactions and to find or model novel ligand receptors, as there is a direct physical linkage between the polypeptide expressed in the capsid (phenotype) and the genetic information (genotype) (Smith, 1985). The physical link between genetic information and recombinant protein expression in each single phage particle allows for iterative rounds of selection of clones with particular capabilities, i.e., panning, coupled with steps of amplification of a clone of interest or of sublibraries. Rare ligand-binding clones are enriched based on their specificity and rescued from complex libraries (over 10^10 recombinant phage particles) (Goodyear & Silverman, 2001). Polypeptide molecules of diverse types and properties have been successfully displayed on the surface of filamentous phage and selected by different means. As reported by Goodyear and Silverman (2002), these polypeptides include enzymes (McCafferty et al., 1991), antibodies (McCafferty et al., 1990; Burton & Barbas, 1993; Winter et al., 1994; Zheng et al., 2005), short peptides and protein fragments (Smith, 1985; Cwirla et al., 1990; Scott and Smith, 1990; Petersen et al., 1995), cytokines (Gram et al., 1993), antigens (Crameri et al., 1994), and extracellular receptor domains (Chiswell and McCafferty, 1992; Wu et al., 1995). The surface display technology is also a robust molecular tool for investigating the proteomic complexity of protein-protein interactions mediated by natural or artificial ligands. Natural ligands displayed on the phage surface include allergen libraries (Crameri & Walter, 1999; Crameri 
et al., 1994), carbohydrates and polysaccharides (Deng et al., 1994; Yamamoto et al., 1999), RNA binding proteins (Danner & Belasco, 2001), fatty acids and membrane lipids (Qiu and Marcus, 1999; Gargir et al., 2002; Nakai et al., 2005) and ligands for Gal80p (Hertveldt et al., 2003). Moreover, phage display technology can be used for the identification and characterization of novel ligand receptor-binding interactions in studies of structure-activity relationships (SARs) (Qiu et al., 1999; Li et al., 2009; Bannister et al., 2011; Garbelli et al., 2011). In phage display (or peptide display) technology, segments of genes (selected or generated at random) or full-length cDNAs of interest are inserted in frame with a gene encoding one of the capsid proteins from lytic (T4, T7 or lambda phage) or non-lytic filamentous bacteriophages (M13, f1 or fd) (Goodyear & Silverman, 2002; Li et al., 2010). In the case of filamentous phage, the adopted strategy for phage display relies on cloning the nucleotide segment encoding the peptide sequence of interest in fusion with the gene for one of the phage capsid proteins, typically protein III or VIII (pIII and pVIII, respectively). The choice of the display protein, whether it be pIII, pVIII or another bacteriophage coat protein, will influence the panning outcome in terms of either ligand avidity or binding affinity (Qiu et al., 1999; Fagerlund et al., 2008). The recombinant virus genome is then packaged, and the library proteins are expressed as fusion (chimera) capsid proteins on the phage surface. The phage display library is typically screened by bait selection on immobilized supports. These supports include functionalized beads (chromatographic resins), multi-well plates or membranes, to which the bait protein is attached. As the partner protein (bait) can be localized in complex biological systems (cells, tissues and organs), the panning procedure can be performed in vivo with experimental animals as well as in vitro with isolated 
proteins, cells and tissues in culture (Michon et al., 2002; Kehoe and Kay, 2005; Valadon et al., 2006; Li et al., 2006; Zahid et al., 2010; Bahudhanapati et al., 2011; Kanki et al., 2011). Although there are numerous advantages associated with this technique, one potential limitation of phage display is the difficulty of correctly expressing cDNA libraries fused to phage capsid proteins. Technically, nucleotide-coding sequences are cloned into the 5' end of capsid genes to produce virus particles decorated with heterologous proteins fused to the N-terminus of the coat protein. However, stop codons and unpredictable reading frame shifts in the fused gene constructs interfere with coat protein expression and decrease the efficiency of polypeptide surface display. Strategies have been developed by several researchers to overcome such technical issues and improve phage display technology. For example, one direct alternative is to express library polypeptides at the carboxyl terminus of a filamentous phage coat protein and thus avoid interruption of the translated chimera (Jespers et al., 1995; Hufton et al., 1999; Brunet et al., 2002). Using a similar strategy, fusion of cDNAs to the 3' end of the 10B gene of T7 phage allows for the expression of the cDNA phage display library at the C-terminus of the capsid protein (Danner and Belasco, 2001). Using a coupled version to produce chimeric cDNA libraries and coat proteins for display, Crameri and Suter (1993) prepared a phagemid vector in which cDNA libraries were cloned into the 3' end of a c-Fos leucine zipper domain gene segment to produce polypeptide libraries fused to the carboxyl terminus of c-Fos, while the c-Jun leucine zipper domain is expressed in fusion with filamentous phage pIII. As a result, during phage assembly, a c-Jun and c-Fos heterodimer is formed, and polypeptide libraries are displayed at the C-terminus of pIII. Another strategy used to display cDNA libraries, which was developed by Caberoy and collaborators 
(2009), employs a modified version of T7 phage display. In this version, referred to as the T7Bio3C vector, a cleavage motif of human rhinovirus 3C protease was fused to the C-terminus of the capsid 10B protein, together with two GS flexible linkers and a biotin tag. Such vectors accommodate cDNA libraries in all three possible reading frames, improving the recovery of recombinant full-length cDNAs. Regardless of the method used to prepare the polypeptide phage display library, panning (affinity selection) and clone enrichment represent critical aspects of the method. Although some technical problems may arise when cloning cDNA sequences for polypeptide display on bacteriophage capsids, they can potentially be solved by changing the strategies of DNA construction, and phage display is still an efficient, sensitive and indispensable method to investigate the diversity of peptide molecules in natural libraries expressed in the glands of venomous animals. Moreover, phage display is a powerful component of the arsenal of functional proteomics dedicated to elucidating protein-protein interactions, identifying novel receptor-binding peptide ligands, and discovering peptide drug leads, among other applications of protein chemistry. With such technical characteristics, the surface display technology can be easily adapted and automated to an array format for high-throughput screening, conditions that meet the productivity seen in genomic and proteomic approaches (Walter et al., 2001; Georgieva and Konthur, 2011) and applications in drug discovery and medicine (Sergeeva et al., 2006). Phage display platforms for cDNA cloning based on T7 phage biology are commercially available from Novagen/EMD4Biosciences (USA), whose technical literature can be consulted.

Yeast two hybrid system for investigating animal toxin interactions
Another powerful molecular biology technique suitable for characterization of specific protein-protein interactions and potentially useful for binding studies of animal toxins and their cellular or tissue targets is known as the yeast two-hybrid system (Y2H). The Y2H is an interaction trap technology designed to analyze the interaction between two known cloned proteins of interest or to screen a library for a gene encoding an unknown protein that interacts with a specific known target (the bait). The system works with chimeric proteins, which are prepared by fusing nucleotide sequences encoding a polypeptide of interest (e.g., a given animal toxin) with a DNA binding domain (DBD), resulting in the chimera DBD-toxin on one side, and by fusing the putative partner protein (or library) with a transcriptional activation domain (AD), generating the AD-partner on the other side. The fused DNA sequences coding for both constructs are co-transformed into yeast (Saccharomyces cerevisiae), and the chimeric proteins are then expressed and transported to the nucleus. When the two proteins interact, the DNA binding domain and transcriptional activation domain are brought together and then activate the transcription of the two reporter genes (Serebriiski and Joung, 2002). The DBD used is the LexA protein from bacteria, which interacts in the nucleus with lexA operators located upstream of the β-galactosidase and LEU2 (or HIS3) genes (the first and second reporters of the system). Positive yeast two-hybrid clones with a trapped interacting protein, which are plated on selective medium lacking leucine (or histidine) and containing X-Gal, are identified by their blue color and are selected for validation. A more detailed assessment of Y2H technology can be found in several articles published during the last two decades (Luban and Goff, 1995; Miller and Stagljar, 2004; Lentze and Auerbach, 2008; Ratushny and Golemis, 2008; Suter et al., 2009; Brückner et al., 2009; Fields et al., 2009). Examples in the literature of 
yeast or bacterial two-hybrid systems used for toxin research are still rare. Most studies that use interaction trap experiments with two-hybrid systems involve toxins from organisms other than venomous animals, such as plants and microorganisms. For example, in wheat, a necrotizing toxin produced by P. tritici-repentis (Ptr ToxA) was shown to interact with a chloroplast protein involved in endocytosis in both ToxA-sensitive and ToxA-resistant plant cultivars (Manning et al., 2007). Using Y2H, Rádis-Baptista and collaborators (personal communication) analyzed the interaction of crotamine with neural protein targets. Crotamine is a low molecular weight cationic polypeptide found in the venom of the South American rattlesnake (Crotalus durissus terrificus) that, despite its toxic effect on animal tissues, has arisen as a promising template for drug development and biomedical applications (Kerkis et al., 2010; Radis-Baptista, 2011). When injected i.p. into mice, crotamine causes rapid (< 10 min) and specific hind limb paralysis. Thus, it was first hypothesized that a neural receptor was a target of crotamine, thereby triggering a toxic response. To address this hypothesis, a mouse nervous system cDNA library was constructed in a pB42AD plasmid vector, producing fusions between the transcriptional activation domain (AD) and library sequences, while the bait DBD-crotamine fusion was cloned into a pLexA plasmid. Both plasmids were obtained from the MATCHMAKER LexA Two-Hybrid System (Clontech Laboratories, Inc., Mountain View, CA, USA). After co-transformation of a suitable yeast strain (EGY48 [p8op-lacZ]) with the fusion plasmids, tests were performed to ascertain whether any protein partners interacted. The preliminary results showed that DBD-crotamine self-activated the expression of the β-galactosidase reporter gene. Initially, this mechanism was unclear, but later, Kerkis and collaborators (2004, 2010) proved that crotamine was able to enter eukaryotic cells and bind specifically to 
the chromosomes, thereby validating a portion of the data obtained from the analysis with Y2H and crotamine. A scheme illustrating the cloning strategy for Y2H screening is presented in Figure 3. The bacterial two-hybrid (B2H) system is analogous to the Y2H system, except that one polypeptide of interest (e.g., from a library) is linked to a subunit of the E. coli RNA polymerase (RNA pol), while the other protein of interest (for example, the bait) is fused to a DNA-binding domain. Similar to Y2H, when a protein-protein interaction does occur, the DNA-binding domain (in the bait-DBD fusion) recruits an RNA pol moiety to a weak promoter in the host E. coli, and transcription of a reporter gene is activated, thereby indicating a positive interaction trap (Goodyear and Silverman, 2001; Dove and Hochschild, 2004).

In vitro display techniques for investigation of animal toxins
One concern regarding phage display and two-hybrid systems is the involvement of living organisms (bacteria, phage and yeast) in the process of library preparation and selection. Therefore, true in vitro selection technologies have been developed in which the number of molecules that can be handled is not limited by cellular transformation efficiencies and directed protein evolution can be achieved independently of successive rounds of randomization (Amstutz et al., 2001). Two of these techniques are ribosome display and directed evolution coupled to cDNA display. Like phage display and Y2H (or B2H), a physical link between genotype (RNA and DNA) and phenotype (expressed/displayed protein) exists in ribosome and cDNA display. In ribosome display, genotype and phenotype are coupled through non-covalent ternary complexes consisting of mRNA, ribosome and nascent polypeptide, which can fold correctly while still attached to the ribosome (Hanes and Plückthun, 1997). A DNA library coding for particular proteins of interest is transcribed in vitro. The mRNA is then purified and used for in vitro translation. Because the stop codon has been removed from the protein-encoding sequences in the DNA library, the ribosome stalls at the 3' end of the mRNA during in vitro translation, giving rise to a ternary complex of mRNA, ribosome, and encoded protein. In general, the protein is able to fold correctly on the ribosome because a carboxyl-terminal spacer has been genetically fused to it, thus allowing the protein of interest to fold outside of the ribosomal tunnel. High concentrations of magnesium and low temperature further stabilize the ternary complex. These complexes, which are formed during in vitro translation, can be used directly to select for the properties of the displayed protein. After affinity selection and elution from a ligand (immobilized or in solution), the mRNA is purified, reverse-transcribed, and amplified by PCR. Following successive rounds 
of selection, which enriches the ligands at a rate of 100-1000-fold per cycle, the pooled DNA can be cloned in an expression vector for identification and large-scale preparation of selected ligand (Hanes and Plückthum, 1997;Amstutz et al., 2001;Schaffitzel et al., 2002).Ribosome display has been used for in vitro selection of biologically relevant macromolecules such as antigenic epitopes (Yau et al., 2003;Lee et al., 2004;Yang et al., 2007), cell-surface receptor modulators (Milovnik et al., 2009), and enzymes (Amstutz et al., 2006;Quinn et al., 2008).cDNA display is a technology similar to ribosome display in which the ternary complexes are formed by the covalent coupling of mRNA, bearing puromycin at the 3' end via an oligonucleotide linker to the carboxyl terminus of nascent protein (Roberts and Szostak, 1997;Nemoto et al., 1997;Miyamoto-Sato et al., 2000).In cDNA display, as seen in ribosome display, the DNA library coding for particular polypeptides of interest is first transcribed in vitro.The mRNA is purified and ligated to the puromycin linker in the presence of T4 RNA ligase.The mRNA-puromycin linker is then translated in a cell-free system (e.g., reticulocyte lysate), and when the ribosome reaches the RNA-DNA junction, translation stops, and the puromycin moiety enters the peptidyl transferase site of the ribosome, thereby allowing for the formation of a covalent link between the puromycin linker and the nascent polypeptide.The covalent linked polypeptide and mRNA are rapidly purified from the ribosome by biotin-streptavidin capture, and cDNA is synthesized by reverse transcription.The purified complex, composed of the hybrid mRNA-cDNA and polypeptide, is then ready for affinity selection using the target molecule of interest (Yamaguchi et al., 2009).
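The per-cycle enrichment quoted above (100- to 1000-fold) can be turned into a rough estimate of how many selection rounds a rare binder needs before it dominates the pool. The following Python sketch assumes a deliberately simple model in which the binder to non-binder odds are multiplied by a constant factor in every round; the starting frequency of one binder in ten million and the function names are hypothetical and are not taken from the cited studies.

```python
def binder_fraction(initial_fraction: float, enrichment: float, rounds: int) -> float:
    """Fraction of binders in the pool after a number of selection rounds.

    Simplifying assumption: each round multiplies the binder:non-binder
    odds by a constant enrichment factor.
    """
    if not 0.0 < initial_fraction < 1.0:
        raise ValueError("initial_fraction must lie strictly between 0 and 1")
    if enrichment <= 1.0:
        raise ValueError("enrichment must be greater than 1")
    odds = initial_fraction / (1.0 - initial_fraction)
    odds *= enrichment ** rounds
    return odds / (1.0 + odds)


def rounds_to_majority(initial_fraction: float, enrichment: float) -> int:
    """Smallest number of rounds after which binders exceed half of the pool."""
    rounds = 0
    while binder_fraction(initial_fraction, enrichment, rounds) <= 0.5:
        rounds += 1
    return rounds


if __name__ == "__main__":
    start = 1e-7  # hypothetical: one binder per ten million library members
    for factor in (100.0, 1000.0):  # enrichment range quoted in the text
        needed = rounds_to_majority(start, factor)
        print(f"{factor:.0f}-fold per round: {needed} rounds to exceed 50% binders")
```

Real selections rarely enrich at a constant rate, so the result is best read as an order-of-magnitude guide to why a handful of rounds is usually sufficient.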
In addition to affinity selection, both ribosome display and cDNA display allow for in vitro directed protein evolution (Yanagida et al., 2010; Dreier and Plückthun, 2011). In each round of selection, conventional or error-prone PCR with non-proofreading Taq polymerase can introduce a number of mutations and consequently increase the diversity of the nucleotide sequences (DNA library) and of the displayed proteins (Cadwell and Joyce, 1992; Schaffitzel and Plückthun, 2001). Recently, Naimuddin and collaborators (2011) applied cDNA display technology and directed protein evolution to engineer an elapid snake three-finger toxin (3FTx) scaffold designed to identify modulators of the interleukin-6 receptor (IL-6R). Three-finger toxins are well-conserved protein structures characterized by β-sheets and three protruding loops (loops I to III). These loops differ slightly among snake toxins and are responsible, to a certain degree, for their diverse biological activities and toxicities (Endo and Tamiya, 1987; Kini and Doley, 2010). Building on previous studies (Yamaguchi et al., 2009), Naimuddin and colleagues (2011) generated a 3FTx library containing 1.2 × 10¹¹ molecules by randomizing the DNA sequences encoding all three loops of the Micrurus corallinus (coral snake) α-neurotoxin (MicTx3), which served as the template, and discovered 3FTx-derived peptide ligands of IL-6R. This result reinforces the potential benefits of employing cDNA display for in vitro protein evolution and target-driven selection. Figure 4 depicts a schematic view of the process of cDNA display.
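To put the reported library size of 1.2 × 10¹¹ molecules in perspective, it can be compared with the theoretical number of variants produced by fully randomizing a given number of amino acid positions. The Python sketch below is only an upper-bound calculation: it assumes that all 20 amino acids are allowed at every randomized position and that every molecule in the library is unique. The number of randomized positions used in the example is hypothetical, because the randomization scheme is not specified here.

```python
AMINO_ACIDS = 20          # standard amino acids allowed per randomized position
LIBRARY_SIZE = 1.2e11     # library size reported in the text


def sequence_space(randomized_positions: int) -> int:
    """Number of distinct peptides for fully randomized positions."""
    if randomized_positions < 0:
        raise ValueError("randomized_positions must be non-negative")
    return AMINO_ACIDS ** randomized_positions


def max_fully_covered_positions(library_size: float) -> int:
    """Largest number of positions whose full sequence space fits in the library."""
    positions = 0
    while sequence_space(positions + 1) <= library_size:
        positions += 1
    return positions


def coverage(library_size: float, randomized_positions: int) -> float:
    """Upper bound on the fraction of sequence space sampled by the library."""
    return min(1.0, library_size / sequence_space(randomized_positions))


if __name__ == "__main__":
    print("Positions fully coverable:", max_fully_covered_positions(LIBRARY_SIZE))
    # Hypothetical example: ten randomized positions spread over the three loops.
    print("Coverage with 10 positions:", coverage(LIBRARY_SIZE, 10))
```

Under these assumptions, a library of this size samples only a small fraction of the possible sequences once more than a few positions are randomized, which is one reason why iterative mutation and selection are useful.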

Heterologous expression of toxin genes in eukaryotic and prokaryotic systems
Although many classes of identified polypeptide toxins are expressed in minute amounts in the venom glands of several animals, milligram quantities of such bioactive molecules are required for accurate biochemical and pharmacological characterization. Recombinant DNA technology can be utilized to prepare vectors for cloning and expression of the toxin gene of interest. In most cases, the expression vectors are plasmids that contain promoters that direct the synthesis of large amounts of mRNA (from the cloned cDNA), sequences that encode genetic traits that allow vector-containing cells to be selected, and sequences that increase the efficiency with which the mRNA is translated (Ausubel et al., 1995). For recombinant expression of proteins and polypeptide toxins, researchers can make use of prokaryotic and eukaryotic systems. The most widely used prokaryotic host is the bacterium E. coli, in which expression is directed by T7 promoters inserted into the vectors, as originally developed by Studier and collaborators (Studier and Moffatt, 1986; Studier et al., 1990). In this T7 promoter-driven system, a relatively small amount of T7 RNA polymerase provided from a cloned copy of T7 gene 1 is sufficient to direct high-level transcription from a T7 promoter in a multicopy plasmid, thereby producing the recombinant protein in a short amount of time (< 3 h) and in quantities that can exceed 50% of the total cell protein. To improve the solubility of recombinant protein produced in E. coli and to facilitate the downstream process of purification, vectors are available that allow for the expression of fusion proteins, such as maltose-binding protein, glutathione-S-transferase, hexa-histidine and thioredoxin fusions (Ausubel et al., 1995). Recombinant fusion proteins are easily purified by affinity chromatography, and their extra portion ('tag' or 'carrier' protein) can be chemically or enzymatically removed via cleavage signals present in the protein sequence (Sambrook and Russell, 2001). Overexpression of recombinant protein can be toxic to the host bacterial cell, and insoluble intracellular aggregates, known as inclusion bodies (IBs), are often formed. These inclusion bodies are composed of almost pure unfolded protein, which can be properly refolded after disruption by different means (Marston, 1986; Marston and Hartley, 1990). Numerous parameters have been tested to avoid IB formation, such as promoter strength, codon usage, gene dosage, and the temperature of induced expression (Martinez-Alonso et al., 2009). However, Garcia-Fruitos and collaborators (2005) have shown that aggregation of overexpressed protein as inclusion bodies does not inactivate enzymes and fluorescent proteins. Given this, when designing the experimental strategy, it is important to consider not only the selection of an appropriate lysis method but also the design of the washing steps needed to isolate native protein and recover undisturbed active protein (Garcia-Fruitos, 2010). Prokaryotic expression systems are commercially available from several biotech companies, and their respective technical resources are easily accessed for additional information (e.g., EMD4Biosciences USA, Life Technologies/Invitrogen USA, GE Healthcare USA and New England Biolabs USA).
A system that combines an insect virus (baculovirus) as a vector and cultured insect cells as a host has been utilized for the expression and production of heterologous protein. The baculovirus-insect cell system has proved to be an excellent choice for protein expression as it has several advantages, including the production of the protein in high yield with the appropriate eukaryotic post-translational modifications (Luque and O'Reilly, 1999). The biological activity of the products and their similarity to native proteins offer a great advantage over conventional bacterial expression systems (Patterson et al., 1995). Vectors for expression of heterologous proteins in other eukaryotic cells have also been developed. These vectors allow for the production of recombinant polypeptides in mammalian-derived cells, such as African green monkey kidney fibroblast-like (COS) cells (Warren and Shields, 1984) and Chinese hamster ovary (CHO) cells (Cockett et al., 1990; Kaufman et al., 1991), as well as in budding yeast (S. cerevisiae), fission yeast (Schizosaccharomyces pombe) and methylotrophic yeast (Pichia pastoris) (Trueman, 1995; Gellissen and Hollenberg, 1997; Li et al., 2005; Takegawa et al., 2009). One study compared the performance of five principal eukaryotic expression systems, including stable expression in transfected adherent CHO cells, transient expression in mammalian COS cells, and baculovirus expression in invertebrate insect cells (Geisse et al., 1996). Each expression system has advantages and disadvantages that should be considered when selecting a method to prepare recombinant proteins. The expected yield, the time required for production of the desired protein, the need for protein refolding and post-translational modifications, the possibility of scale-up, and the costs are examples of what should be taken into account when expressing a biologically active protein. When dealing with animal toxins, one should remember that most toxin families contain members with three, four or five disulfide bonds, which confer on each toxic polypeptide high stability and structural rigidity, properties that are necessary for proteins secreted into the venom. Therefore, systems or conditions that allow disulfide bond formation are much more advantageous.
The successful heterologous production of animal toxins using the systems mentioned above has been reported. For example, several functional snake venom proteins have been expressed in Pichia pastoris, such as the cystatin-like cysteine-protease inhibitors from the elapid Austrelaps superbus (Richards et al., 2010); a venom P-II metalloproteinase (Jerdonitin) composed of metalloproteinase and disintegrin domains from Trimeresurus jerdonii venom (Zhu et al., 2010); a thrombin-like enzyme (gloshedobin) from the venom of Gloydius shedaoensis in fusion with HSP-70 (Yang et al., 2009); the disintegrin domain of a metalloprotease from the green pit viper Trimeresurus albolabris (Singhamatr and Rojnuckarin, 2007); a thrombin-like enzyme (Ancrod) from the venom of Calloselasma rhodostoma (Yu et al., 2007); and a fibrinogenolytic serine protease from T. albolabris venom (Muanpasitporn and Rojnuckarin, 2007). With bacterial expression systems, recent reports have described a disintegrin (r-mojastin 1) from the venom of the Mohave rattlesnake, Crotalus scutulatus scutulatus (Sánchez et al., 2010); gloshedobin, separately fused at its N terminus with three fusion partners (NusA, GST, and TrxA) (Jiang et al., 2010); C-type lectins (BML-1 and BML-2) from the venom of Bungarus multicinctus (Lin et al., 2007); a C-type lectin (BJcuL) from Bothrops jararacussu venom (Kassab et al., 2004); eretrin, an active spider toxin that acts on penile erectile function, from the venom of Phoneutria nigriventer (Torres et al., 2010); and Huwentoxin-I, a small neurotoxin (33 amino acids in length) from the venom of the Chinese bird spider Ornithoctonus huwena (Che et al., 2009). Scorpion toxins have been expressed in baculovirus for the purpose of insect control (Gershburg et al., 1998; Rajendra et al., 2006). Yonamine and collaborators (2009) have transiently expressed a thrombin-like enzyme (gyroxin) with esterase activity from the venom of the South American rattlesnake C. durissus terrificus using COS-7 cells. This article was the first to report the functional expression of a snake toxin in a mammalian expression system. Such works provide guidance for researchers choosing a system and a vector to express a toxin of interest for the characterization of its biological activity at the molecular level.
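Because this section notes that most toxin families contain members with three to five disulfide bonds, a quick first check when choosing an expression host is to count the cysteines in the mature toxin sequence. The Python sketch below is a minimal illustration of that idea, not an established selection rule. The peptide shown is invented for the example and is not a real toxin sequence, and the threshold of three bonds is an assumption taken from the range mentioned in the text. The cysteine count gives only an upper bound, since some cysteines may remain unpaired.

```python
# Hypothetical toy peptide, invented for illustration; not a real toxin sequence.
TOY_PEPTIDE = "MKCYTCPESKCRATNCGGKSCFKLTCDPSRCNYACK"


def count_cysteines(sequence: str) -> int:
    """Count cysteine residues (one-letter code 'C') in a protein sequence."""
    return sequence.upper().count("C")


def max_disulfide_bonds(sequence: str) -> int:
    """Upper bound on disulfide bonds, since each bond pairs two cysteines."""
    return count_cysteines(sequence) // 2


def favors_oxidizing_system(sequence: str, min_bonds: int = 3) -> bool:
    """Flag sequences that could form at least `min_bonds` disulfide bonds.

    The default threshold of three is an assumption based on the range of
    three to five bonds mentioned in the text.
    """
    return max_disulfide_bonds(sequence) >= min_bonds


if __name__ == "__main__":
    print("Cysteines:", count_cysteines(TOY_PEPTIDE))
    print("Maximum disulfide bonds:", max_disulfide_bonds(TOY_PEPTIDE))
    print("Consider a disulfide-competent system:", favors_oxidizing_system(TOY_PEPTIDE))
```

A flagged sequence suggests considering hosts or conditions that support disulfide bond formation, alongside the other criteria discussed above such as yield, cost and scale-up.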

Conclusion
The scientific achievements that outlined the biochemical basis of the transmission of genetic information culminated in the advent of Molecular Biology and genetic engineering (or recombinant DNA technology), which revolutionized the Life Sciences. Influenced by genetic engineering, molecular techniques have been employed in fields as diverse as Environmental Science and Medicine, although all these fields work towards the common goal of improving the quality of human life. The applications of recombinant DNA technology range from food analysis and processing to clinical diagnostics and therapy. The holistic view that the genome, the transcriptome and the other associated 'omics' (encompassing metabolome, glycome, peptidome and pharmacogenomics) are responsible for the characteristics of a biological system and for the status of an organism in health or disease has radically influenced the way we use chemicals, including medicines. In addition, new disciplines, such as Systems and Synthetic Biology, have emerged from the sequential 'revolutions' in Biology. Methods that are based on the linkage between genotype (DNA, RNA and cDNA) and phenotype (expressed polypeptides), and that are suitable for high-throughput evaluation, have become essential for yielding timely results for current challenges. Therefore, gene sequence libraries prepared and maintained in different formats (e.g., cDNA, genomic, and subtractive) are of paramount significance for understanding cell function and comprehending the entire organism. Moreover, with a view to human health and prosperity, society (and the scientific community) may take advantage of the information contained in such libraries with the aim of producing better biotechnological goods and drugs. Furthermore, genotype-linked phenotype-based libraries, such as phage/surface display, yeast and bacterial two-hybrid systems, and ribosome and cDNA display, offer unprecedented technical capabilities to rapidly identify specific target-binding ligands with potential drug applications. Because polypeptides are involved in such interaction trap technologies, the gene (mRNA) of interest can be subcloned into a given expression vector, and the (therapeutic) peptide or protein of interest can be prepared in the amounts required for clinical research and medical practice. In this context, animal toxins, which are found in nature as complex mixtures in the venom of numerous species inhabiting distinct geographical locations and belonging to various biomes, constitute the ultimate biological resources for drug discovery and development. Families of animal toxins contain proteins that evolved for millions of years as a result of positive Darwinian selection (accelerated evolution), thereby generating conserved protein scaffolds with distinct biological and pharmacological activities. In this way, a dozen animal toxins have either been converted into drugs and diagnostic tools or served as templates for drug design. Consequently, the combined use of refined and robust molecular techniques, designed to assess the biotechnological potential of venom polypeptides and their precursors (genes and mRNA), will offer priceless rewards for scientific endeavors in toxin research. As presented in this chapter, the application of molecular cloning techniques in toxin studies can link basic research on natural compounds to applied research in the pharmaceutical industry, which might ultimately be translated into practical scientific answers for an inquiring biological world.

Fig. 3. Basic steps involved in the preparation of an interaction trap assay with Y2H.

Fig. 4. Depiction of the reiterative steps involved in cDNA display and main procedure for generating libraries for selective screening.