Molecular and adaptive evolution of Nep2 gene from carnivorous plant Nepenthes

Nepenthes obtain nutrients by carnivory using their pitchers. Prey drowned in the pitcher fluid is digested by enzymes called nepenthesins, including nepenthesin II. The structure of the nepenthesin II-encoding gene might be related to the role of the enzyme. Therefore, the objective of this study was to examine the molecular and adaptive evolution of the Nep2 gene, which encodes nepenthesin II. We analyzed 29 Nepenthes species that represent most habitat types. Total DNA was extracted from silica-dried leaf samples, and the Nep2 gene was amplified using degenerate primers. Homology searching was conducted using BLASTn, followed by computation of the isoelectric point of the enzyme and testing for positive selection using MEGA 5. The results showed that the 29 DNA sequences of the Nep2 gene have no introns. An intron-less Nep2 gene is expected to produce nepenthesin II rapidly to digest the prey. The gene experienced significant positive selection in N. sumatrana, a species that inhabits the lowest-altitude habitats among Sumatran species. An obvious adaptive phenotype is the development of two unusual types of lower pitchers to obtain nutrients in lowland habitats. In conclusion, the molecular and adaptive evolution of the Nep2 gene characterizes Nepenthes as highly adaptable plants that actively respond to environmental stress.


Introduction
Carnivorous pitcher plants of the genus Nepenthes have evolved specifically to inhabit marginal, nutrient-deficient environments [1][2][3] and to exploit niches where non-carnivorous plant species are less equipped to compete successfully. They augment their nutrient uptake by carnivory with their pitchers [4]. Morphological adaptations, including a wetness-dependent peristome and slippery wax crystals, cause prey to fall into the pitcher [5][6][7][8]. The prey then drowns in the pitcher fluid [9,10] and is subsequently digested by enzymes within the fluid [4].
Nepenthes secrete acid proteinases to digest the protein of prey that is trapped and drowned in their pitcher fluid [11], and they absorb the digestion products as a nitrogen source [3,12]. The acid proteinases in the pitcher fluid were identified as nepenthesins, which are divided into nepenthesin I and II [13]. Both enzymes have optimal activity at acidic pH and are most stable at pH 3 [13,14]. Nepenthesin I and II from N. distillatoria differ considerably in their properties: they have different molecular masses, and their activity and stability differ at certain temperatures and pH values. In addition, nepenthesin I and II from N. gracilis share only 66.6% amino acid sequence identity [13]. Despite their differences, these 2 acid proteinases are the only enzymes known to be specialized in prey digestion in the pitcher fluid of Nepenthes [14].
Nepenthesin is an aspartic proteinase (AP), a member of a family of protease enzymes that use an aspartate (Asp) residue to catalyze cleavage of their peptide substrates [13]. Aspartic proteinases are found widely in plants [15,16] and in other organisms, including animals, fungi, bacteria and viruses [17,18]. In plants, APs are distributed in seeds, leaves and flowers [19], as well as in the digestive fluid of carnivorous plants, including the pitcher fluid of Nepenthes [13,14,20].
Some plant APs have been purified and well characterized, such as phytepsin from barley [15,21] and oryzasin from rice [22]. These 2 plant APs were identified as intracellular vacuolar enzymes and share a plant-specific insertion sequence in the middle of their DNA sequences. In addition, AP homologs cloned from the pitcher tissue of Nepenthes alata, which also belong to the vacuolar aspartic proteinases, contain a so-called plant-specific insertion [20]. In contrast, nepenthesins from N. gracilis do not have any plant-specific insertion sequences. Instead, they have a specific insertion, termed the nepenthesin aspartic proteinase (NAP)-specific insertion. Therefore, APs from the pitcher fluid of Nepenthes plants clearly belong to a novel subfamily of APs [13].
The plant AP gene has undergone both gain and loss of introns during molecular evolution. The rice AP oryzasin 1 gene comprises 14 exons and 13 introns [22]. Another plant AP gene, from Fagopyrum esculentum (GenBank: AM422870), which belongs to the same order (Caryophyllales) as Nepenthes, comprises the same number of exons and introns as the oryzasin 1 gene. Non-plant AP genes have different compositions of introns and exons; for instance, human cathepsin D [23], rat renin [24] and bovine chymosin [25] are all composed of 9 exons and 8 introns. In contrast, the yeast proteinase gene has no introns [26]. Therefore, the molecular evolution of APs, as well as their structure-function relationships and physiological roles, has become an interesting field of study.
In the present study, we attempted to isolate the nepenthesin-encoding genes in order to study their molecular evolution and structure-function relationships, and to detect positive selection operating on the genes. For these purposes, we designed degenerate primers for the amplification of both nepenthesin I and II, based on an alignment of the gene sequences and their homologs available in GenBank. However, we succeeded in amplifying only the nepenthesin II-encoding gene, namely Nep2. The study of Nep1 of Nepenthes will be conducted in the future.

Amplification and sequencing of Nep2 gene
Total DNA was extracted from silica-dried leaf samples with a QIAGEN DNeasy Mini Plant Kit (Qiagen) following the manufacturer's protocol. Amplification was performed using 2 pairs of degenerate primers (table 2). Ex-Taq buffer and Ex-Taq DNA polymerase (Takara Bio) were used for the amplification of the Nep2 gene. The polymerase chain reaction (PCR) protocol consisted of an initial 90-s pre-denaturation at 96°C; 40 cycles of 45 s at 96°C (denaturation), 80 s at 58.5°C (annealing), and 70 s at 72°C (extension); and a final 7-min extension at 72°C.
The PCR products were cleaned using the Wizard SV Gel and PCR Clean-Up System (Promega) and were used for the autocycle sequencing reaction following the manufacturer's (Beckman Coulter) instructions. Autocycle sequencing products were cleaned by ethanol precipitation. Both forward and reverse sequences were analyzed with a CEQ8000 automated sequencer (Beckman Coulter), using the same primers as for PCR. Six internal primers (table 2) were designed to obtain better sequences of the Nep2 gene. The genomic DNA sequences of the Nep2 gene were deposited in GenBank (table 1).

Identification of the Nep2 gene and characterization of nepenthesin II
To identify the Nep2 gene, homology searching was conducted using the BLASTn program on the NCBI server. To characterize nepenthesin II, the DNA sequences of the Nep2 genes were translated into amino acid sequences using the Expasy Translate tool, followed by computation of the isoelectric point (pI) of each enzyme on the Expasy server. The pI is the pH at which a molecule or surface carries no net electrical charge [27]. Further characterization of nepenthesin II, including determination of the prepro-nepenthesin II form along with its active sites, cysteine residues, N-glycosylation sites, acidic and basic residues, and the NAP-specific insertion, was conducted by comparison with nepenthesin II from N. gracilis [13].
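The definition of the pI given above can be illustrated with a minimal sketch that finds the pH at which the summed charges of the ionizable groups cancel. This is not the Expasy implementation: the pKa values, the function names `net_charge` and `isoelectric_point`, and the toy sequences are assumptions chosen for illustration, and real tools use their own pKa tables, so absolute values will differ.

```python
# Illustrative pKa values (assumed, textbook-style; not the Expasy table).
PKA_POSITIVE = {"Nterm": 9.0, "K": 10.5, "R": 12.5, "H": 6.0}
PKA_NEGATIVE = {"Cterm": 2.0, "D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}


def net_charge(protein, ph):
    """Net charge of a protein at a given pH (Henderson-Hasselbalch)."""
    # Free amino terminus (positive) and free carboxyl terminus (negative).
    positive = 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE["Nterm"]))
    negative = 1.0 / (1.0 + 10 ** (PKA_NEGATIVE["Cterm"] - ph))
    for residue in protein:
        if residue in ("K", "R", "H"):
            positive += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[residue]))
        elif residue in ("D", "E", "C", "Y"):
            negative += 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[residue] - ph))
    return positive - negative


def isoelectric_point(protein, low=0.0, high=14.0, tolerance=1e-4):
    """Bisection on pH: net charge decreases monotonically as pH rises."""
    while high - low > tolerance:
        middle = (low + high) / 2.0
        if net_charge(protein, middle) > 0:
            low = middle   # still positively charged, pI lies above
        else:
            high = middle  # negatively charged (or neutral), pI lies below
    return (low + high) / 2.0


# Toy peptides (hypothetical): acid-rich versus base-rich.
acid_rich = "DDDDEEEEG"
base_rich = "KKKKRRRRG"
print(isoelectric_point(acid_rich), isoelectric_point(base_rich))
```

An acid-rich sequence yields a low pI and a base-rich one a high pI, which is the qualitative reason why the balance of acidic and basic residues matters for an enzyme that works in acidic pitcher fluid.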

Test of positive selection on Nep2 gene
Positive selection on the Nep2 gene was tested in MEGA 5 [28] by computing the average numbers of synonymous and nonsynonymous substitutions and applying the codon-based Z-test of selection, based on the Nei-Gojobori method [29], to sequence pairs from the 29 Nep2 nucleotide sequences. All ambiguous positions were removed from each sequence pair. The null hypothesis of strict neutrality states that the number of synonymous substitutions per synonymous site (dS) equals the number of nonsynonymous substitutions per nonsynonymous site (dN); the alternative hypothesis, dN > dS, indicates positive selection. The null hypothesis was rejected at the 5% level when the computed probability was below 0.05. The variance of the difference between dN and dS was estimated by the bootstrap method with 1000 replications.
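The test statistic described above can be sketched as Z = (dN − dS) / sqrt(Var(dN − dS)), with the variance taken from bootstrap replicates and a one-sided P value for dN > dS. The sketch below is not MEGA's code: the function names and the numbers in the example are hypothetical, the bootstrap differences are assumed to have been computed elsewhere by resampling codon columns, and the Jukes-Cantor correction is shown only as one common variant of the Nei-Gojobori distance, since the text does not state which variant was used.

```python
import math


def jukes_cantor(proportion):
    """Jukes-Cantor correction of a proportion of differences (p < 0.75)."""
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * proportion)


def z_test_positive_selection(dn, ds, bootstrap_differences):
    """One-sided Z-test of H0: dN = dS against H1: dN > dS.

    bootstrap_differences holds dN - dS from each bootstrap replicate;
    their sample variance estimates Var(dN - dS).
    """
    n = len(bootstrap_differences)
    mean = sum(bootstrap_differences) / n
    variance = sum((d - mean) ** 2 for d in bootstrap_differences) / (n - 1)
    z = (dn - ds) / math.sqrt(variance)
    # Upper-tail probability of the standard normal distribution.
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))
    return z, p_value


# Hypothetical numbers for a single sequence pair (not data from this study).
z, p = z_test_positive_selection(dn=0.04, ds=0.02,
                                 bootstrap_differences=[0.01, 0.02, 0.03])
print(f"Z = {z:.2f}, P = {p:.4f}")
```

In practice the list of bootstrap differences would hold 1000 values, one per replicate, and neutrality would be rejected for a pair when P falls below 0.05.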

Nep2 genomic DNA sequences
In the present study, we succeeded for the first time in amplifying the genomic DNA sequences of the AP Nep2 gene for all species examined. All of the amplified DNA sequences are most similar to the aspartic proteinase Nep2 cDNAs of N. gracilis and N. mirabilis from GenBank, with identities of 94%−98% and 93%−98%, respectively. The Nep2 genes from the 29 Nepenthes species varied in length between 1314 and 1317 bp, with most being 1317 bp. The Nep2 genes that are 1314 bp in length have a deletion either in the propeptide (N. khasiana) or in the enzyme (N. stenophylla, N. tentaculata, and N. rajah). Based on the alignment of the 29 Nep2 gene sequences together with the Nep2 cDNA sequences from N. gracilis and N. mirabilis, none of the Nep2 gene sequences contains introns (data not shown).

Nepenthesin II characterization
The prepro-form of nepenthesin II from the 29 Nepenthes species is in most cases composed of 438 amino acids, comprising a putative signal sequence of 24 residues, a putative propeptide of 55 residues, and an enzyme of 359 residues (figure 1). Nepenthesin II contains 12 cysteine residues per protein molecule and 2 active site sequence motifs, aspartic acid-threonine-glycine (D-T-G) and aspartic acid-serine-glycine (D-S-G), as well as the so-called flap tyrosine residue, assigned to residue 96 (figure 2). In addition, nepenthesin II appears to have a NAP-specific insertion of 22 residues, assigned to residues 70−91 (figure 2), except in N. stenophylla, which has a deletion within its NAP-specific insertion sequence. Moreover, nepenthesin II contains 1 or 2 potential N-glycosylation sites (table 3) and varying numbers of acidic (aspartic acid and glutamic acid) and basic (histidine and arginine) residues (figure 2). The numbers of acidic and basic residues range between 28−35 and 1−4, respectively (table 3).
In figure 2, the 12 cysteine (C) residues, the 2 active sites (DTG and DSG), and the flap tyrosine (Y) residue are highlighted in yellow, green, and pink, respectively; the 22 residues of the NAP-specific insertion are marked with a blue line; the acidic (D and E) and basic (H and R) residues are in red and green font, respectively; and the N-glycosylation sites (NVS, NLS, and NAS) are underlined.
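The features tallied above (N-glycosylation motifs, acidic and basic residue counts, active site motifs) can be located in a translated sequence with a short scan. The following is a minimal sketch, not the procedure used in the study: the function names and the toy sequence are hypothetical, the N-X-S/T sequon rule (X being any residue except proline) is the general consensus rather than something stated in the text, and the basic residues follow the paper's own definition (H and R).

```python
import re

# N-glycosylation sequon N-X-[S/T], X != P; the lookahead allows overlaps.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def n_glycosylation_sites(protein):
    """Return (1-based position, motif) for each potential sequon."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(protein)]


def count_residues(protein, residues):
    """Count how many residues of the protein belong to the given set."""
    return sum(protein.count(r) for r in residues)


# Segment lengths reported in the text add up to the prepro-form length.
assert 24 + 55 + 359 == 438

# Toy sequence (hypothetical, not a real nepenthesin II sequence).
toy = "MANVSDDTGSNLSDSGHR"
print(n_glycosylation_sites(toy))      # potential N-glycosylation sites
print(count_residues(toy, "DE"))       # acidic residues
print(count_residues(toy, "HR"))       # basic residues, as defined above
print("DTG" in toy, "DSG" in toy)      # active site motifs present?
```

Applied to each of the 29 translated sequences, such a scan would reproduce the kind of per-species counts summarized in table 3.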

Synonymous and nonsynonymous substitutions
There was a total of 438 positions in the final dataset of Nep2 genes. The codon-based Z-test of selection was applied to all sequence pairs. The probability (P) of rejecting the null hypothesis of strict neutrality (dN = dS) in favor of the alternative hypothesis of positive selection (dN > dS) was less than 0.05 for several of the sequence pairs involving N. sumatrana. This result suggests that the evolution of the Nep2 gene of N. sumatrana has been under positive selection. The test statistics for N. naga, N. spathulata, N. stenophylla, N. glabrata, N. densiflora, N. lingulata, N. platychila, N. diatas, and N. ventricosa were positive, but the corresponding P values were above 0.05. This result suggests that the evolution of the Nep2 genes of those 10 Nepenthes species has been under strong purifying selection.

Structural and functional relationships of the Nep2 gene
The present study reveals that the Nep2 genes from 29 Nepenthes species have no introns. This is the first report of a structural feature of the nepenthesin-encoding gene from Nepenthes species. This structural feature may be related to the function of the gene. In some eukaryotes, genes whose expression levels change promptly in response to environmental stress have significantly lower intron densities. In addition, introns could slow down regulatory responses and have been selected against in genes whose transcripts need fast adjustment for survival under environmental stress [30].
In some cases, transcription occurs at 1200−1500 nucleotides per minute [31], while the half-lives of splicing reactions are less than 1 minute for the first intron but 2−8 minutes for subsequent introns [30,32]. Consequently, splicing of 2 or more introns takes longer than the transcription itself [30]. Therefore, an intron-less Nep2 gene is expected to produce its protein rapidly for digesting trapped prey. This rapid production of nepenthesin II may help to avoid putrefaction of trapped prey, which would result in an accumulation of ammonium that may kill the pitcher [4]. This result is consistent with immunohistochemical staining of Nepenthes pitcher tissue, which indicated that nepenthesins are secreted directly into the pitcher fluid and function without accumulating in the pitcher tissue [13]. The rapid production of nepenthesin II is also corroborated by the small quantity of fluid contained in newly opened pitchers, which is usually less than 1/6 of the total volume in mature ones [4]. Thus, the Nep2 gene is supposed to have adapted specifically to produce extracellular nepenthesin II digestive enzymes rapidly by removing its introns during molecular evolution.
As an extracellular proteinase, nepenthesin II of the genus Nepenthes is synthesized in the endoplasmic reticulum (ER), travels to the Golgi apparatus and then to the plasma membrane for secretion. This route is known as the secretory pathway. The signal sequence of nepenthesin II is recognized by specific cellular components that facilitate the proper routing of the protein. Because the protein is synthesized in the ER and secreted via the Golgi apparatus and plasma membrane, its signal sequence is of the ER signal type, which is usually located near the amino terminus [33]. As an ER-type signal, the signal sequence of nepenthesin II is composed mostly (67%) of nonpolar amino acids: methionine (M), alanine (A), valine (V), glycine (G), leucine (L), isoleucine (I), and proline (P) (figure 3).
The alignment of the nepenthesin II signal sequences from 29 Nepenthes species shows some substitutions within the signal sequences, including valine (V) to leucine (L) and alanine (A) to glycine (G) (N. khasiana), valine (V) to alanine (A) (N. longifolia), leucine (L) to glycine (G) (N. tobaica), leucine (L) to valine (V) (N. papuana), and glycine (G) to alanine (A) (N. sanguinea) (figure 3). However, these substitutions are not expected to change the properties of the protein, since the substituting amino acids are also nonpolar.
All of the nepenthesin II enzymes examined contain 12 cysteine residues, which would form 6 disulphide bonds that are expected to contribute greatly to the stability of the enzyme [13]. Moreover, the structure stabilized by 6 disulphide bonds makes the protein resistant to protease degradation [34]. These structures suggest that nepenthesin II can remain in the pitcher fluid without being digested [14], as indicated by its retention of 85% of the original activity after 30 days at pH 3 [13].
All nepenthesin II enzymes of the 29 Nepenthes species contain a NAP-specific insertion of approximately 22 residues, preceding the flap tyrosine residue (figure 2). This insertion contains 4 cysteine residues as well as 4 acidic residues, except in N. sanguinea, N. khasiana, N. campanulata, N. faizaliana, and N. spathulata, which have only 3 acidic residues. In addition, N. copelandii and N. bellii have only 2 acidic residues within their NAP-specific insertion. Overall, the differences in the numbers of acidic and basic residues determine the pI of the enzymes, which varies among the nepenthesin II enzymes from a highest value of 3.45 (N. khasiana) to a lowest value of 2.95 (N. thai) (table 3). The sequences of the NAP-specific insertion of nepenthesin II are not conserved, since substitutions appear within the insertion in some Nepenthes species (figure 2). In contrast, the flap tyrosine residue following the NAP-specific insertion, as well as the 2 active site motifs, aspartic acid-threonine-glycine (D-T-G) and aspartic acid-serine-glycine (D-S-G), are conserved among the 29 nepenthesin II enzymes (figure 2).
During the synthesis of prepro-nepenthesin II in the ER and its subsequent transport to the Golgi apparatus, carbohydrate is attached to the enzyme in the process of glycosylation. Since nepenthesin II has N-glycosylation motif site(s), the carbohydrate, in the form of oligosaccharide chains, is attached to the nitrogen of an asparagine (N) side chain in the sequence motifs asparagine-leucine-serine (N-L-S) and asparagine-valine-serine (N-V-S) within the nepenthesin II sequence of most Nepenthes species. In addition, the sequence motif asparagine-alanine-serine (N-A-S) [33], which resulted from the substitution of valine (V) with alanine (A), could act as an N-glycosylation site within the nepenthesin II sequence of N. tentaculata (figure 2).
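The timing argument above can be made concrete with a short calculation. This is a rough sketch under stated assumptions: it applies the cited transcription rate range to the most common Nep2 length reported in this study (1317 bp), and the function name `transcription_minutes` is hypothetical. Splicing half-lives are not converted into completion times here, because a half-life alone does not fix the total time needed.

```python
NEP2_LENGTH_NT = 1317           # most common Nep2 length reported in this study
RATE_NT_PER_MIN = (1200, 1500)  # transcription rate range cited in the text


def transcription_minutes(length_nt, rate_nt_per_min):
    """Time to transcribe a gene of the given length at a constant rate."""
    return length_nt / rate_nt_per_min


slowest = transcription_minutes(NEP2_LENGTH_NT, RATE_NT_PER_MIN[0])
fastest = transcription_minutes(NEP2_LENGTH_NT, RATE_NT_PER_MIN[1])
print(f"Nep2 transcription: {fastest:.2f} to {slowest:.2f} min")
```

Under these assumptions, transcribing an intron-less Nep2 gene takes roughly 0.9 to 1.1 minutes, which is shorter than the 2−8 minute half-life cited for a single subsequent intron.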

Adaptive evolution of nepenthesin II enzyme
The present study is the first to address adaptive evolution of the Nep2 gene using statistical analysis based on the relative abundance of synonymous and nonsynonymous substitutions. Adaptive evolution after gene duplication has been reported in several gene families [34,35]. In this study, the Nep2 gene, a member of the aspartic proteinase gene family, appears to have experienced significant positive selection in N. sumatrana (table 4), a species that inhabits the lowest-altitude habitats (0−800 m) among Sumatran endemic species, most of which are highland species [4]. Adaptation to lowland habitats in Sumatra would be influenced by multiple physiological and genetic factors. For instance, at the physiological level, an obvious adaptive phenotype is the development of 2 distinct types of lower pitchers in N. sumatrana. Lower pitchers of the first type are borne on seedlings and juvenile plants and are wholly or partially ovate in form. Lower pitchers of the second type are produced by a basal offshoot developing from the rootstock and are more squat in form than those of the first type [4].
The function of both types of lower pitchers is to trap creeping insects [4]. Most of the nitrogen of some Nepenthes species that inhabit lowland habitats is provided by ants [12,36,37,38]. In our previous study [39], N. sumatrana was placed in the same subclade as 13 other Sumatran endemic species and was most closely related to N. spathulata and N. tobaica. N. spathulata is known as a highland species, and N. tobaica is distributed from lowland to highland (table 1). Interestingly, both N. spathulata and N. tobaica have narrow peristomes on their upper pitchers, whereas N. sumatrana has a broad peristome [4]. Because it provides a wider slippery surface, this broad peristome on the upper pitchers of N. sumatrana may help the species to obtain more nutrients from flying insects than its two closest relatives [6]. Therefore, the 2 distinct types of lower pitchers of N. sumatrana and the broad peristome of its upper pitchers reflect the strategy employed to obtain nutrients from trapped prey, and the abundance of nutrient uptake should be correlated with the fitness of the species in lowland habitats. The 2 distinct types of lower pitchers of N. sumatrana are not found in other lowland Nepenthes species or in highland species [4]. These 2 distinct types of lower pitchers reveal that N. sumatrana has developed a specific adaptation in response to the nutrient stress that characterizes the habitat where it grows. Thus, they may demonstrate that N. sumatrana is under selective pressure from prey and environment.

Conclusion
The molecular and adaptive evolution of the Nep2 gene characterizes Nepenthes as highly adaptable plants that actively respond to environmental conditions and the availability of prey in their habitats.