Characterization of gibberellin 2-oxidase isoforms in coconut (Cocos nucifera L.)

Gibberellins (GAs) are plant hormones essential for many developmental processes, including seed germination, stem elongation, leaf expansion, trichome development, pollen maturation and the induction of flowering. Gibberellin 2-oxidase (GA2-ox) regulates plant growth by inactivating endogenous bioactive GAs through 2β-hydroxylation. No information is available about GA2-ox encoding genes or their functions in coconut. In this study, we identified 10 transcripts encoding different isoforms of GA2-ox from coconut leaf transcriptome data. Sequence comparison and phylogenetic analysis revealed that these 10 transcripts represented different types of GA2-ox. The secondary structure, three-dimensional structure and active sites of these 10 isoforms were predicted. Docking studies of different active GAs with these isoforms were also carried out.


Introduction
Gibberellic acid (GA) is a plant hormone that plays an important role in many aspects of plant growth and development, such as seed germination, stem elongation and flower development (Yamaguchi and Kamiya, 2000; Hedden and Thomas, 2012; Davière and Achard, 2013; Gupta and Chakrabarty, 2013). GA biosynthesis is regulated by developmental, hormonal and environmental stimuli and affects various biological processes by controlling the levels of active GAs (Harberd et al., 1998). The identification of most of the genes involved in the metabolic pathways for gibberellin hormones has helped in understanding these pathways and their regulation. Many of the enzymes involved are multifunctional (Yamaguchi, 2008). Genetic manipulation of GA metabolism can dramatically influence crop yield (Sakamoto et al., 2004). GA 20-oxidase (GA20-ox), GA 3-oxidase (GA3-ox) and GA 2-oxidase (GA2-ox) are three enzymes which catalyze the later reactions in the GA biosynthesis pathway. These enzymes belong to the 2OG-Fe(II) oxygenase superfamily, are each encoded by a multigene family (Hedden and Phillips, 2000) and are regulated differently, adding unexpected genetic complexity (Lo et al., 2008).
*Corresponding Author: rajesh.mk@icar.gov.in
GA activity positively regulates the expression of the GA2-ox genes, which inactivate GAs by 2-β-hydroxylation (Hedden and Phillips, 2000). Such feedback and feed-forward regulation maintains the level of bioactive GAs and is the basis for GA homeostasis (Yamaguchi and Kamiya, 2000). Most of the non-bioactive GAs in plants exist as precursors of the bioactive forms or as deactivated metabolites. The concentration of bioactive GAs in plant cells is tightly maintained by the balance between GA biosynthesis and deactivation. High concentrations of GAs tend to repress the expression of GA20-ox genes, which promote the production of bioactive GAs, but stimulate the expression of GA2-ox genes, which deactivate GAs through 2-β-hydroxylation (Sakamoto et al., 2004; Yamaguchi, 2008). Overexpression of GA2-ox induces typical GA-deficient phenotypes in higher plants, such as dwarfism and small dark green leaves (Schomburg et al., 2003). The physiological functions of GA2-ox genes have been studied in model plant species such as Arabidopsis and rice (Schomburg et al., 2003; Rieu et al., 2008).
Isozymes are variants of an enzyme with the same function that are found in the same individual (Hunter and Markert, 1957). These enzymes may have different kinetic rates or different regulatory properties, or may be expressed in a tissue-specific manner. In rice, 10 genes coding for different GA2-ox isozymes have been identified (named OsGA2-ox1 to OsGA2-ox10) (Lo et al., 2008; Yamaguchi, 2008). The increased expression of rice C20 GA2-oxs generally leads to semi-dwarfism with little or no influence on yield level, and may thus represent a useful approach for manipulating plant height to raise yield potential (Lo et al., 2008). C20-GA2-ox proteins have been characterized in Arabidopsis, spinach and cucumber (Schomburg et al., 2003; Lee and Zeevaart, 2005; Pimenta Lange et al., 2013). In many plant species, the GA20-ox, GA3-ox and GA2-ox functions are carried out by enzymes encoded by small gene families (Phillips et al., 1995; Thomas et al., 1999; Sakamoto et al., 2004; Han and Zhu, 2011), which account for both functional redundancy and tissue specificity (Mitchum et al., 2006).
Coconut (Cocos nucifera L.), an important plantation crop belonging to the family Arecaceae, gives a wide range of products for human use and is considered "the symbol of tropics". It plays an important role in the economy of tropical countries. However, coconut cultivation is presently confronted with a relative decline in many countries due to the explosive competition from other oil crops, the increased demand for timber, drought, pests, diseases and low soil fertility. Furthermore, the slow growth and long pre-breeding period of the palm inhibit the genetic enhancement of coconut for productivity and tolerance to biotic and abiotic stresses. Coconut palms have been classified into talls and dwarfs based on plant stature and earliness. The traditional commercial coconuts were the tall varieties, which were preferred over the dwarf varieties because of the quality and quantity of copra they produce (Woodroof, 1979). They normally live for over 60 years, are adaptable to a wide range of soil conditions, are resistant to diseases and water stress, and start to bear within six to ten years. The dwarf varieties come into bearing within three to four years, attain full production by the ninth year and have a life span of about 30 to 40 years. The dwarf varieties exhibit greater resistance than the talls to diseases, including lethal yellowing and root (wilt) disease (Been, 1981; Nair et al., 2004).
Studying dwarfing genes assumes importance in coconut in the current scenario, in which dwarf cultivars and hybrids between tall and dwarf palms are preferred because of the scarcity of climbers to harvest coconuts or to carry out plant protection measures in tall palms (Shafeeq et al., 2015). Thus, manipulation of the GA metabolic pathway by overexpression of the GA2-ox gene may be an effective means for modifying plant height. Recently, RNA-Seq of the leaf transcriptome of Chowghat Green Dwarf (CGD) has been reported (Rajesh et al., 2013). Also, Shafeeq et al. (2015) have reported reconstruction of GA biosynthesis in coconut using the leaf transcriptome data. The coconut genome is yet to be sequenced, and thus transcriptome analysis can be a useful resource for gene discovery. In this study, we utilized the data derived from RNA-Seq on an Illumina HiSeq 2000 platform and de novo assembly of the leaf transcriptome of the dwarf coconut cultivar CGD. We identified 10 isoforms of the GA2-ox gene from the transcriptome data based on protein sequence comparisons. We also used the well-established KEGG annotation system of Arabidopsis and rice to analyze the functions of the 10 GA2-ox isoforms. Motif and structure prediction was carried out, and in silico molecular interaction studies were performed using GA catabolites.

Data source
For the current study, coconut leaf transcriptome data (SRX 436961) was used to explore the GA biosynthetic pathway in coconut.

Annotation and pathway analysis
We annotated the gibberellic acid biosynthetic pathway genes by a comparative genomics approach. GA biosynthetic pathway genes with known functions were identified from the literature, and the corresponding protein sequences from model organisms, along with the metabolic steps, were retrieved from the Uniprot (http://www.uniprot.org/), Genbank (http://www.ncbi.nlm.nih.gov/genbank/) and KEGG (http://www.genome.jp/kegg/pathway.html) databases. Enzymes of interest were identified from Arabidopsis thaliana and from monocots such as rice (Oryza sativa), maize (Zea mays), wheat (Triticum urartu), date palm (Phoenix dactylifera) and oil palm (Elaeis guineensis). The protein sequences were used to construct a phylogenetic tree in MEGA (Tamura et al., 2007) with 1000 bootstrap replicates.
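To illustrate what a bootstrap percentage on a tree branch measures, the following is a minimal Python sketch, not the MEGA procedure itself. It resamples alignment columns with replacement and counts how often a given pair of sequences remains the closest pair under a simple p-distance. The sequence names, the toy alignment and the closest-pair criterion are assumptions made for illustration only; the tree-building method used in MEGA is not stated in the text.

```python
import random


def p_distance(a, b, columns):
    """Fraction of the sampled alignment columns at which two sequences differ."""
    return sum(a[i] != b[i] for i in columns) / len(columns)


def bootstrap_support(alignment, pair, replicates=1000, seed=0):
    """Share of bootstrap replicates in which `pair` is the closest pair.

    Each replicate resamples alignment columns with replacement, which is
    the core idea behind bootstrap values reported on tree branches.
    """
    rng = random.Random(seed)
    names = sorted(alignment)
    length = len(alignment[names[0]])
    wins = 0
    for _ in range(replicates):
        cols = [rng.randrange(length) for _ in range(length)]
        closest = min(
            ((x, y) for i, x in enumerate(names) for y in names[i + 1:]),
            key=lambda p: p_distance(alignment[p[0]], alignment[p[1]], cols),
        )
        if set(closest) == set(pair):
            wins += 1
    return wins / replicates


# Hypothetical toy alignment (equal-length, already aligned sequences).
toy_alignment = {
    "seqA": "MKVLAAGT",
    "seqB": "MKVLAAGT",
    "seqC": "QRSTWYHP",
}
support = bootstrap_support(toy_alignment, ("seqA", "seqB"))
print(f"bootstrap support: {100 * support:.0f}%")
```

In this toy case the two identical sequences group together in every replicate; real alignments give intermediate values, such as the 65% and 97% reported for the coconut isoforms.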

Comparative genomics and functional annotation
Sequences of GA2-ox from date palm (Phoenix dactylifera) and oil palm (Elaeis guineensis) were aligned with the coconut leaf transcriptome data using the TBLASTN program (Altschul et al., 1997) and an HMMER-based search (Finn et al., 2011). The critical values of the alignment were set as an E-value ≤ 1e-10, together with identity percentage (indicating the similarity between the aligned sequences with respect to the length of the matched region) and coverage percentage (indicating the similarity between the aligned sequences with respect to the size of the query sequence). Only the sequences which showed major BLAST similarity were analyzed further. Predicted genes were subjected to gene ontology analysis (BLAST2GO) and family domain analysis using ESTscan (Iseli et al., 1999) and SMART (Letunic et al., 2012).
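The filtering step described above can be sketched in Python as follows. This is an illustrative example rather than the pipeline actually used: it assumes the hits are available in the default 12-column BLAST tabular layout, and the identity and coverage cutoffs (40% and 50%) are placeholder assumptions, since only the E-value threshold is given in the text. The query name, contig names and query length are hypothetical.

```python
from dataclasses import dataclass

E_VALUE_MAX = 1e-10   # threshold stated in the text
MIN_IDENTITY = 40.0   # assumed cutoff (%), not given in the text
MIN_COVERAGE = 50.0   # assumed cutoff (%), not given in the text


@dataclass
class Hit:
    query: str
    subject: str
    identity: float   # percent identity over the matched region
    q_start: int
    q_end: int
    evalue: float


def parse_hits(lines):
    """Parse rows of the default 12-column BLAST tabular output."""
    hits = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        f = line.rstrip("\n").split("\t")
        hits.append(Hit(f[0], f[1], float(f[2]), int(f[6]), int(f[7]), float(f[10])))
    return hits


def query_coverage(hit, query_length):
    """Percent of the query sequence spanned by the aligned region."""
    return 100.0 * (abs(hit.q_end - hit.q_start) + 1) / query_length


def filter_hits(hits, query_lengths):
    """Keep hits passing the E-value, identity and coverage thresholds."""
    return [
        h for h in hits
        if h.evalue <= E_VALUE_MAX
        and h.identity >= MIN_IDENTITY
        and query_coverage(h, query_lengths[h.query]) >= MIN_COVERAGE
    ]


# Hypothetical rows: qseqid sseqid pident length mismatch gapopen
# qstart qend sstart send evalue bitscore
rows = [
    "PdGA2ox\tcontig_1\t72.5\t200\t55\t0\t10\t209\t30\t629\t1e-80\t250",
    "PdGA2ox\tcontig_2\t35.0\t60\t39\t0\t1\t60\t1\t180\t1e-12\t60",
    "PdGA2ox\tcontig_3\t80.0\t30\t6\t0\t1\t30\t1\t90\t1e-5\t40",
]
kept = filter_hits(parse_hits(rows), {"PdGA2ox": 330})
print([h.subject for h in kept])
```

Here the second row fails the assumed identity cutoff and the third fails the E-value threshold, so only the first contig is retained.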

Isoform identification
The GA2-ox transcripts retrieved were extracted from the transcriptome nucleotide sequences and translated to protein sequences using ExPASy. The translated sequences were subjected to a BLASTp search against Arabidopsis, and the GA2-ox isoforms were identified.
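The translation step can be illustrated with a short Python sketch of six-frame translation using the standard genetic code. This is an independent illustration under stated assumptions, not the ExPASy implementation. The `longest_orf` helper, which picks the longest stop-free stretch without requiring a start codon (assembled transcripts may be partial), is a hypothetical selection rule that is not described in the text, and the input sequence is invented.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate from the first base; unknown codons become 'X', stops '*'."""
    seq = seq.upper().replace("U", "T")
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def six_frames(seq):
    """Three forward-strand frames followed by three reverse-strand frames."""
    seq = seq.upper().replace("U", "T")
    rc = reverse_complement(seq)
    return [translate(s[offset:]) for s in (seq, rc) for offset in range(3)]


def longest_orf(seq):
    """Longest stop-free stretch over all six frames (assumed selection rule)."""
    return max((piece for frame in six_frames(seq) for piece in frame.split("*")), key=len)


transcript = "ATGGCTGGTAAATAA"  # hypothetical toy sequence
for number, frame in enumerate(six_frames(transcript), start=1):
    print(number, frame)
```

Each candidate protein obtained this way would then be compared against known GA2-ox sequences, as in the BLASTp step described above.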

Motif identification, structure prediction and docking studies
The programme MEME (Bailey et al., 2009) was used for the recognition of motifs in GA2-ox isoforms. MEME was run from the web server with the following parameters: for each motif, the minimum width was six amino acids and the maximum width was 50 amino acids, and the maximum number of motifs was set at 15. The consensus motifs were obtained using the MAST programme (Bailey et al., 2009). Three-dimensional structures of the protein isoforms were predicted by homology modeling and threading using the Phyre2 server (http://www.sbg.bio.ic.ac.uk/phyre2/). The predicted structures were validated using Ramachandran plots. Further processing of the protein structures was carried out with the "protein preparation wizard" in Maestro (Schrödinger, Version 10.1.012). Molecular docking studies were carried out using the 'Glide module' in the Schrödinger molecular modeling environment. The structures of GA substrates were obtained from PubChem (https://pubchem.ncbi.nlm.nih.gov/).

Results and discussion
Prior to this work, virtually no information was available about the genes encoding GA2-ox in coconut, although orthologous genes have been intensively studied in model plant species such as Arabidopsis and rice (Schomburg et al., 2003; Lo et al., 2008; Yamaguchi, 2008). These GA2-ox enzymes use C19 GAs as their substrates. The isolation and analysis of GA2-ox genes in coconut is important because it might provide useful information on the characteristics of these genes, which are potentially useful resources for manipulating plant height and yield level.
We annotated the GA2-ox genes and their isoforms from the leaf transcriptome data (SRX 436961). After clustering and assembly, the sequences were assembled into 254,302 contigs, 159,932 scaffolds and 130,942 unigenes with a sequence size of more than 100 bp. These unigenes were used for identification of GA biosynthetic genes. Thirty-seven transcripts showing homology to GA biosynthetic pathway genes were identified (Shafeeq et al., 2015). The GO analysis of all 37 transcripts showed that they belong to the GA biosynthesis pathway, with specific functions. Contigs were annotated using BLAST2GO to assign Gene Ontology classifications. The annotated genes were compared with the KEGG pathway, and their functions and sites of action were identified. We identified 10 GA2-ox isoforms ranging from 107 to 220 amino acids in length. The isoforms were designated as Cocos nucifera GA2-ox (CnGA2-ox) isoforms 1 to 10. For confirmation, we carried out BLASTp analysis of the translated protein isoform sequences against Arabidopsis thaliana, which gave the following results: CnGA2ox1 and CnGA2ox8 showed similarity to gibberellin 2 oxidase 6; CnGA2ox2, CnGA2ox3, CnGA2ox4, CnGA2ox5, CnGA2ox6, CnGA2ox7 and CnGA2ox9 showed similarity to gibberellin 2-β-deoxygenase 2; and CnGA2ox10 showed similarity to gibberellin 2-β-deoxygenase 3.

Motif analysis of the GA2-ox
When we used MEME to identify and examine the conserved motifs in all 10 GA oxidase isoforms, we found a total of 12 conserved motifs (Fig. 1). Differences among motif distributions indicated sources of functional divergence in GA oxidases over evolutionary history (Han et al., 2011). From the phylogenetic analysis, it was clear that the GA2-ox isoforms were distributed in two major clades (Fig. 2). The first cluster was arranged with all the gibberellin 2-β-deoxygenase 2 isoforms (with highest bootstrap support of 100%) i.e., CnGA2ox2 to CnGA2ox9 except CnGA2ox7 isoform. The CnGA2ox7 isoform was related to CnGA2ox10 isoform (with a bootstrap value 65%) which is gibberellin 2-β-deoxygenase 3. This might be due to the sequence similarity between CnGA2ox7 and CnGA2ox10. However, the second cluster was subdivided into two clusters in which one consisted of isoform CnGA2ox1 and CnGA2ox8 which was proved to be gibberellin 2 oxidase 6 (with a bootstrap value of 97%) and the second cluster consisted of CnGA2ox7 and CnGA2ox10 where CnGA2ox7 was found to be gibberellin 2-β-deoxygenase 6 and CnGA2ox10 to be gibberellin 2-β-deoxygenase 3 (Fig. 2).

Analysis of 3-D structures of protein isoforms
In order to analyze the characteristics of functional motifs, we predicted the secondary structures and modeled the 3D structures of the isoforms (Fig. 3). All protein isoforms belonged to the double stranded α-helix fold. However, the isoforms differed in their structural composition. According to the tertiary structure analysis, the secondary structural components of isoforms of the same type showed similarity with each other: protein isoform 1, with 220 amino acids, possessed six helices whereas isoform 8, with 202 amino acids, showed five helices, and both showed the same number of strands and turns, i.e., 4 and 11 respectively. In the same way, protein isoforms 3 and 4 had the same length of 207 amino acids and showed 5 and 7 helices, with the same number of strands and turns, 6 and 23 respectively. However, protein isoforms 2 to 9, even though of the same type, showed different numbers of helices, turns and strands, which might be because of their different amino acid lengths. Isoform 10 consisted of 120 amino acids with 3 helices, 11 strands and 10 turns. The Ramachandran plot for amino acid distribution in most favoured region was 93.0, 74., 80.2, 83.2, 82.1, . According to the validation, some isoforms did not show sufficient quality for a good structure, but they attained acceptable structural quality after Prime energy minimization.

CASTp
Using CASTp analysis, we identified the most favoured binding pockets of all the 10 GA2-ox isoforms. The three best binding pockets were selected and are shown in three colors (Fig. 4).
Molecular docking was carried out using the Glide module in the Schrödinger software suite. Isoforms with greater sequence length were selected for molecular docking, and the results are presented in Table 1. The catabolites for docking analysis were selected from the KEGG pathway. The results indicate that gibberellin 2 oxidase 6 had good affinity for GA34, with an XP Glide Score of -5.85 (-30.87 kcal mol⁻¹) and three hydrogen bonds, whereas its affinity for GA29 corresponded to an XP Glide Score of -5.42 (-34.74 kcal mol⁻¹). Gibberellin 2 oxidase 6 also showed good affinity for GA51, with an XP Glide Score of -5.40 (-32.78 kcal mol⁻¹) and two hydrogen bonds. This shows that gibberellin 2 oxidase 6 had greater affinity for GA34 than for GA29 and GA51 (Fig. 5 and 6).
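The comparison of docking results can be made explicit with a small Python sketch that ranks the three ligands by the XP Glide Scores quoted above for gibberellin 2 oxidase 6. It relies on the usual Glide convention that a more negative score indicates stronger predicted binding; the variable names are illustrative, and the score differences are small, so the ranking is only indicative.

```python
# XP Glide Scores quoted in the text for gibberellin 2 oxidase 6.
# Convention assumed: more negative = stronger predicted binding.
xp_glide_scores = {"GA34": -5.85, "GA29": -5.42, "GA51": -5.40}

# Sort ligands from most to least favourable score.
ranked = sorted(xp_glide_scores, key=xp_glide_scores.get)

best = ranked[0]
for ligand in ranked:
    gap = xp_glide_scores[ligand] - xp_glide_scores[best]
    print(f"{ligand}: score {xp_glide_scores[ligand]:.2f}, gap to best {gap:.2f}")
```

Ranked this way, GA34 comes first, followed by GA29 and GA51, which differ from each other by only 0.02 score units.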
Among the gibberellic acid genes, GA2-ox functions to metabolize bioactive GAs (GA1 and GA4) and their immediate precursors (GA9, GA12, GA20 and GA53) to inactive GAs by hydroxylating their 2-carbon position. Genes encoding GA2-ox have been isolated and characterized in various plant species (Lester et al., 1999; Hedden and Phillips, 2000). In Arabidopsis thaliana and Nicotiana tabacum, overexpression of AtGA2-ox7 or AtGA2-ox8 induced severely dwarf phenotypes (Schomburg et al., 2003). On the other hand, a mutant for the SLENDER gene of Pisum sativum, which encodes GA2-ox, exhibited increased plant height and accumulation of high levels of GA precursors in seeds (Martin et al., 1999). Hence, manipulation of the GA metabolic pathway by overexpression of the GA2-ox gene may be an effective means for modifying plant height. GA2-ox genes play a major role in the deactivation step of GA metabolism. It has been frequently observed that GA2-ox expression is generally low in plant organs that have active mitotic division and undergo rapid growth (Schomburg et al., 2003; Lo et al., 2008; Rieu et al., 2008). This is probably due to the requirement of maintaining a higher pool of bioactive GAs to support active organ growth. When excess GAs are encountered, GA2-ox expression may be increased to degrade such compounds in order to maintain normal GA homeostasis. This phenomenon has also been observed in a previous work (Thomas et al., 1999). Thus, from the molecular docking studies, it appears that the substrates in the pathway are important for the expression pattern of GA2-ox, whose overexpression causes dwarf phenotypes. To conclude, it is important to maintain an optimum activity of GA2-ox and its isoforms, which regulates the bioactive GA levels. This work sheds light on the GA2-ox isoforms in coconut and their interactions with GA catabolites.