RNA EDITING IN Calotropis procera MITOCHONDRIAL NADH-DEHYDROGENASE SUBUNIT 3 GENE


a Department of Biological Sciences, Faculty of Science, King Abdulaziz University (KAU), PO Box 80141, Jeddah 21589, Saudi Arabia
b Plant Molecular Biology, Giza, Egypt
RNA editing refers to posttranscriptional alterations of RNA molecules through insertion, deletion, or modification of nucleotides, not including RNA splicing, capping, or polyadenylation (Nishikura, 2006; Farajollahi and Maas, 2010). RNA editing was first discovered in trypanosome mitochondria (Benne et al., 1986). It is detected as differences between genomic sequences and the corresponding RNA sequences. The predominant type of RNA editing in animals is the conversion of adenosine (A) to inosine (I), catalyzed by a family of adenosine deaminases that act on RNA (Nishikura, 2006). This editing is also known as A-to-G editing because inosine in RNA is read as guanosine (G) by the translational machinery (Nishikura, 2006). Another well-documented type of RNA editing in animals is cytidine-to-uridine (C-to-U) editing, catalyzed by the activation-induced cytidine deaminase/apolipoprotein B mRNA-editing enzyme, catalytic polypeptide-like family of deaminases, but it is less frequent than A-to-G editing (Nishikura, 2006). In land plants, RNA editing highly specifically converts cytidine to uridine nucleotides in transcripts of both plastid and mitochondrial genes (Castandet and Araya, 2011); 34 cytidine residues in plastids and more than 500 residues in mitochondria have been reported to be editing target sites in Arabidopsis thaliana (Chateigner-Boutin and Small, 2007; Bentolila et al., 2008). Analysis of RNA editing in higher plant mitochondrial transcripts specifying cytochrome b (cytb), subunit 1 of the NADH-dehydrogenase (nad1) and cytochrome oxidase subunits II and III (coxII and coxIII) revealed homogeneously edited cDNAs for these loci (Hiesel et al., 1989). RNA editing at the nad3 locus predominantly involves modification of cytidines so that they are recognized as uridines by the reverse transcriptase and presumably by the ribosome (Schuster et al., 1990).
One reverse alteration has been observed in the cytochrome b locus, where a genomically encoded T is modified to C in the cDNA sequence (Hiesel et al., 1989).
A number of cytosines are altered to be recognized as uridines in transcripts of the nad3 locus in mitochondria of the higher plant Oenothera. Such nucleotide modifications are found at 16 different sites within the nad3 coding region of Oenothera mitochondria (Schuster et al., 1990) and at 15 sites in Carthamus tinctorius (Kalinati et al., 2008). The role of nad3 editing in drought tolerance has also been investigated (Yuan and Liu, 2012).
Calotropis procera is a flowering plant in the family Apocynaceae, native to North Africa, Tropical Africa, Western Asia, South Asia, and Indochina (Aiton, 2010). Calotropis species show high growth performance during the dry season, implying the existence of special strategies for drought tolerance (Colombo et al., 2007; Khan et al., 2007; Boutraa, 2010).
In this study, the nad3 gene was identified from genomic DNA (accession no. KP171516) and cDNA (accession no. KP171517) of the desert plant Calotropis procera. RNA editing was then detected at 11 positions of this mitochondrial gene, leading to changes in 11 amino acids of the peptide sequence.

Sample collection and isolation of total RNA and DNA
Three leaf discs of C. procera were collected from the Jeddah region (KSA; latitude 21°26'6.00", longitude 39°28'3.00"). Samples (50 mg tissue each) were frozen in liquid nitrogen, and total RNA was extracted using the RNeasy Plant Mini Kit (Qiagen, cat. no. 74903). To remove DNA contaminants, 3 μl of 10 mg/ml RNase A, DNase- and protease-free (Thermo Scientific, cat. no. EN0531), was added to the RNA samples and the tube was incubated at 30 °C for 15 min. The DNeasy Plant Mini Kit (Qiagen, cat. no. 69106) was used for DNA isolation. DNA and RNA concentrations in the different samples were estimated by measuring optical density at 260 nm. DNA and RNA samples were sent to the Beijing Genomics Institute (BGI), Shenzhen, China, for deep sequencing, and the resulting datasets were provided for analysis.

Next-Generation Sequencing (NGS)
Whole RNA-seq and DNA-seq paired-end short reads of C. procera were generated using the Illumina Genome Analyzer IIx (GAIIx) according to the manufacturer's instructions (Illumina, San Diego, CA).

Sequence filtering and bioinformatics analysis
The raw sequencing data were obtained using the Illumina python pipeline v. 1.3. For the obtained libraries, only high-quality reads (quality >20) were retained. The resulting short paired-end reads were then assembled against the Rhazya stricta mitochondrial genome (accession no. KJ485850) as a reference, using CLC Genomics Workbench 3.6.5.
Ten nad3 sequences (Table 1) belonging to other plant species were obtained from GenBank and used as references for BLAST searches (http://www.ncbi.nlm.nih.gov/BLAST). To produce the nad3 cDNA sequence, the genomic nad3 sequence was used as a reference for the raw RNA sequencing data (Illumina python pipeline v. 1.3).
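The read-quality filter described above can be illustrated with a minimal Python sketch. It is not the pipeline actually used in this study. It assumes that "quality >20" means a mean Phred score above 20 per read, and that quality strings use the Phred+64 encoding of Illumina pipeline 1.3 era data; both are assumptions, and the function names are hypothetical.

```python
from typing import Iterable, Iterator, Tuple

Read = Tuple[str, str, str]  # (header, sequence, quality string)


def mean_phred(quality: str, offset: int = 64) -> float:
    """Mean Phred score of one read (offset 64 is an assumption, see text)."""
    if not quality:
        return 0.0
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def parse_fastq(lines: Iterable[str]) -> Iterator[Read]:
    """Yield (header, sequence, quality) from four-line FASTQ records."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # skip blank lines between or after records
        seq = next(it).strip()
        next(it)  # '+' separator line, ignored
        qual = next(it).strip()
        yield header.strip(), seq, qual


def filter_reads(reads: Iterable[Read],
                 min_quality: float = 20.0,
                 offset: int = 64) -> Iterator[Read]:
    """Keep only reads whose mean Phred score is strictly above min_quality."""
    for header, seq, qual in reads:
        if mean_phred(qual, offset) > min_quality:
            yield header, seq, qual


if __name__ == "__main__":
    # Two toy reads (not real C. procera data): 'h' encodes Q40, 'B' encodes Q2.
    toy = ["@read1", "ACGT", "+", "hhhh",
           "@read2", "ACGT", "+", "BBBB"]
    for header, seq, qual in filter_reads(parse_fastq(toy)):
        print(header, seq, mean_phred(qual))
```

For paired-end data, a pair would normally be kept only if both mates pass the filter. That step is omitted here for brevity.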

Analysis of RNA editing and deduced amino acids
The genomic and cDNA sequences of nad3 of Calotropis procera obtained in the present study were analyzed for RNA editing status by multiple sequence alignment using CLC Genomics Workbench 3.6.5 (http://www.clcbio.com/products/clc-genomics-workbench). Multiple sequence alignment of the deduced proteins was performed using the same program.
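The comparison of genomic and cDNA sequences described above can be sketched in a few lines of Python. This is an illustration of the principle, not the CLC Genomics Workbench procedure. It assumes that the two coding sequences are already aligned without gaps, and that the standard genetic code applies. The function names and the toy sequences are hypothetical and are not the C. procera nad3 data.

```python
from typing import List, NamedTuple

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


class EditSite(NamedTuple):
    position: int      # 1-based nucleotide position in the coding sequence
    genomic_base: str
    cdna_base: str     # written as T, the cDNA equivalent of U
    genomic_aa: str
    cdna_aa: str


def find_edit_sites(genomic: str, cdna: str) -> List[EditSite]:
    """List positions where the cDNA differs from the genomic coding sequence."""
    genomic = genomic.upper().replace("U", "T")
    cdna = cdna.upper().replace("U", "T")
    if len(genomic) != len(cdna) or len(genomic) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, and of equal length")
    sites = []
    for i, (g, c) in enumerate(zip(genomic, cdna)):
        if g != c:
            start = i - i % 3  # first base of the codon containing position i
            sites.append(EditSite(
                i + 1, g, c,
                CODON_TABLE.get(genomic[start:start + 3], "X"),
                CODON_TABLE.get(cdna[start:start + 3], "X"),
            ))
    return sites


if __name__ == "__main__":
    toy_genomic = "ATGCCATCATCTCGG"  # M P S S R
    toy_cdna = "ATGCTATTATTTTGG"     # M L L F W
    for site in find_edit_sites(toy_genomic, toy_cdna):
        print(site)
```

A C in the genomic sequence paired with a T in the cDNA corresponds to a C-to-U edit in the transcript. The amino acid pair then shows whether the edit changes the encoded residue.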

Accession Numbers
Sequence data from this article have been submitted to GenBank data library under accession numbers; C. procera genomic nad3 gene (accession no.

RESULTS AND DISCUSSION
Nad3 is a subunit of complex I of the mitochondrial electron transport chain. Disruption of nad3 editing leads to accumulation of high concentrations of reactive oxygen species (ROS), which reduces drought tolerance in Arabidopsis (Yuan and Liu, 2012). This study therefore aims to characterize RNA editing of nad3 in a desert plant.

Characterization of C. procera nad3 gene
In this study, the nad3 gene of C. procera (accession no. KP171516) was characterized using raw DNA-seq data. A total of 71,349,934 paired-end short DNA sequence reads were generated for C. procera using the HiSeq 2000 Illumina platform (Illumina, San Diego, CA). The nad3 gene of Rhazya stricta (KJ485850) was used as a reference in CLC Genomics Workbench. The best BLAST hits were used to perform a multiple sequence alignment (Table 1). This yielded 10 nad3 gene sequences from 10 different species, in addition to C. procera, and a multiple sequence alignment of the 11 sequences was obtained (Fig. 1). Many investigators have used CLC Genomics Workbench to perform genome sequencing and to characterize genes in different biological systems (Christopher et al., 2011; Cerna et al., 2014; Courtney et al., 2014).

Characterization of C. procera nad3 mRNA
The nad3 cDNA of C. procera (accession no. KP171517) was characterized using raw RNA-seq data. A total of 215,841,902 paired-end short RNA sequence reads were generated for C. procera using the HiSeq 2000 Illumina platform (Illumina, San Diego, CA). The nad3 gene of C. procera (accession no. KP171516) was used as a reference in CLC Genomics Workbench. Earlier investigators used traditional methods to isolate and identify cDNA, which depend on sequencing 9 to 10 clones to confirm the correct sequence (Hiesel et al., 1989; Schuster et al., 1990; Kalinati et al., 2008). On the other hand, Anders and Huber (2010) used NGS data, which contain millions of reads, to confirm the correct sequences using CLC Genomics Workbench (http://www.clcbio.com/products/clc-genomics-workbench).
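Mapped RNA-seq reads can stand in for cDNA clones when confirming an edited position. The Python sketch below illustrates the idea and is not the procedure implemented in CLC Genomics Workbench. It assumes that the bases observed at one genomic C position have already been extracted from the mapped reads. The function names and the 0.1 and 0.9 thresholds are hypothetical choices made for illustration.

```python
from collections import Counter
from typing import Iterable, Optional


def editing_extent(column_bases: Iterable[str],
                   genomic_base: str = "C",
                   edited_base: str = "T") -> Optional[float]:
    """Fraction of informative reads carrying the edited base at one site.

    Returns None when no read shows either the genomic or the edited base.
    """
    counts = Counter(b.upper().replace("U", "T") for b in column_bases)
    informative = counts[genomic_base] + counts[edited_base]
    if informative == 0:
        return None
    return counts[edited_base] / informative


def classify_site(extent: Optional[float],
                  low: float = 0.1, high: float = 0.9) -> str:
    """Label a site using illustrative (hypothetical) thresholds."""
    if extent is None:
        return "no coverage"
    if extent >= high:
        return "edited"
    if extent <= low:
        return "unedited"
    return "partially edited"


if __name__ == "__main__":
    # Toy read columns, not real data.
    for column in ("TTTTTTTTTC", "CCCC", "TTCC", ""):
        extent = editing_extent(column)
        print(repr(column), extent, classify_site(extent))
```

With many reads per position, an intermediate fraction would point to partial editing. With 9 or 10 clones the same estimate is much coarser.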
Editing was detected at 11 sites (nucleotide nos. 44, 62, 80, 209, 215, 230, 247, 266, 275, 317 and 349), all of which were C-to-U conversions. A total of 11 amino acid substitutions resulted from editing, the most common being proline to leucine (P-L). Other changes were serine to leucine (S-L), serine to phenylalanine (S-F), proline to serine (P-S) and arginine to tryptophan (R-W) (Table 2 and Fig. 2). In Arabidopsis mitochondria, RNA editing generally increases the proportion of hydrophobic amino acid codons (Giegé and Brennicke, 1999). It has therefore been suggested that increased hydrophobicity suits the function of proteins and enzymes located in the mitochondrial membrane, such as the nad3 protein (Kalinati et al., 2008). Disruption of C250 editing (cytosine no. 250 in nad3) leads to accumulation of high concentrations of ROS, which reduces drought tolerance in Arabidopsis (Yuan and Liu, 2012). Although Calotropis procera is a desert plant, no editing occurs at C250. At this site, proline is edited to serine in Arabidopsis, rice and sorghum (Yuan and Liu, 2012), whereas in Calotropis procera the corresponding residue is leucine and is not edited. Several investigators have reported that serine or leucine is commonly present, and necessary, in protein binding or recognition sites, whereas proline is unusual at such sites (Matthew et al., 2003). We therefore suggest that C. procera nad3 does not need to be edited at this site, whereas nad3 genes of other species that encode proline at the same position may need to be edited so that nad3 does not lose its activity. Partial RNA editing (some transcripts of the same gene are edited at certain sites and others are not) has been found in mitochondria of some plant species (Kalinaty et al., 2008), whereas other investigators, as in the present study, did not observe this phenomenon (Rurek et al., 2001). We suggest that this heterogeneity arises from the RNA editing mechanism, which has not yet been precisely identified in plants (Aleel, 2011).
We also excluded an effect of mtDNA copy variation on heterogeneous RNA editing, because this would require different genomic nad3 sequences. Investigators have found that all genomic nad3 clones from the same plant species have the same sequence, and that heterogeneity occurs only among cDNA clones (Lu and Hanson, 1996; Kalinaty et al., 2008).
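The positions and substitution types reported above can be examined with simple arithmetic. The Python sketch below maps each edited nucleotide to its codon number and position within the codon, and compares Kyte-Doolittle hydropathy values before and after each type of substitution. It assumes that nucleotide numbering is 1-based from the first base of the start codon, which is not stated explicitly in the text. The use of the Kyte-Doolittle scale is likewise an illustrative choice and not part of the original analysis.

```python
from typing import Tuple

# Edited nucleotide positions reported for C. procera nad3.
EDIT_POSITIONS = [44, 62, 80, 209, 215, 230, 247, 266, 275, 317, 349]

# Substitution types reported in the text (per-site assignment is in Table 2).
SUBSTITUTIONS = [("P", "L"), ("S", "L"), ("S", "F"), ("P", "S"), ("R", "W")]

# Kyte-Doolittle hydropathy values for the residues involved.
KYTE_DOOLITTLE = {"P": -1.6, "S": -0.8, "L": 3.8, "F": 2.8, "R": -4.5, "W": -0.9}


def codon_coordinates(nt_position: int) -> Tuple[int, int]:
    """Return (codon number, position within codon), both 1-based."""
    index = nt_position - 1
    return index // 3 + 1, index % 3 + 1


def hydropathy_shift(before: str, after: str) -> float:
    """Change in hydropathy caused by a substitution (positive = more hydrophobic)."""
    return round(KYTE_DOOLITTLE[after] - KYTE_DOOLITTLE[before], 1)


if __name__ == "__main__":
    for pos in EDIT_POSITIONS:
        codon, within = codon_coordinates(pos)
        print(f"nt {pos}: codon {codon}, codon position {within}")
    for before, after in SUBSTITUTIONS:
        print(f"{before}-{after}: {hydropathy_shift(before, after):+.1f}")
```

Under the numbering assumption, nine of the eleven sites fall on the second codon position and two (247 and 349) on the first. This is consistent with P-S and R-W being first-position changes and P-L, S-L and S-F being second-position changes. All five substitution types raise the hydropathy value, in line with the increase in hydrophobicity discussed above.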

Analysis of the deduced protein sequence
Editing is only an intermediate stage in the process of forming a functional protein (Kalinati et al., 2008). The actual effect of editing needs to be assessed at the protein level. The amino acid sequences derived from the genomic and cDNA sequences of C. procera were compared with the amino acid sequences derived from cDNA of other species. The comparison clearly shows that editing in this gene of C. procera leads to the formation of conserved amino acids (Fig. 3).

Conserved domain analysis
Many investigators have used conserved domain analysis to confirm the functionality of proteins (Copley et al., 2002; Ramadan et al., 2012; Shokry et al., 2014). Domain analysis indicated the presence of the NADH-ubiquinone/plastoquinone oxidoreductase, chain 3 (nad3) domain, with Conserved Domain Database accession number cl00535 and Pfam accession number PF00507 (Fig. 4). Although conserved domain analysis classifies proteins into families and predicts functional sites, it cannot detect differences in activity between the edited gene product and its unedited sequence, because it depends on the peptide sequence rather than on amino acid properties. However, laboratory experiments have shown that disruption of nad3 editing results in loss of its function (Yuan and Liu, 2012).
In conclusion, extensive editing takes place in the nad3 transcript of C. procera, and these editing sites are mostly conserved across plant species. This high degree of conservation in length and composition across plant species, as a result of nad3 editing, indicates the importance of editing. RNA editing appears to minimize the differences between sequences at the protein level, in addition to maintaining a conserved polypeptide sequence for this gene.

SUMMARY
The nad3 (NADH-dehydrogenase subunit 3) gene was identified from genomic DNA (accession no. KP171516) and cDNA (accession no. KP171517) of the desert plant Calotropis procera using RNA-seq and DNA-seq data. A number of cytosines are altered to be recognized as uridines in transcripts of the nad3 locus in mitochondria. The nucleotide modifications were found at 11 different nucleotide positions (nucleotide nos. 44, 62, 80, 209, 215, 230, 247, 266, 275, 317 and 349) within the nad3 coding region. Heterogeneous RNA editing of C. procera nad3 RNA was not detected in this study. These alterations in the mRNA sequence change codon identities so that 11 different amino acids are specified, the most common change being proline to leucine (P-L). Other changes were serine to leucine (S-L), serine to phenylalanine (S-F), proline to serine (P-S) and arginine to tryptophan (R-W). These alterations are common in the mitochondrial nad3 gene of most plant species, with few differences according to the properties of the amino acids involved.

genomic nad3 sequence. Dots indicate similarity to the C. procera genomic nad3 sequence.
Fig. (2): A comparison between the mitochondrial genomic and cDNA sequences of C. procera nad3. The corresponding amino acids are given in the second and fourth lines, respectively. Dots indicate similarity between the genomic and cDNA sequences of the C. procera nad3 gene.