Molecular characterization, modeling, in silico analysis of equine pituitary gonadotropin alpha subunit and docking interaction studies with ganirelix.

Equine pituitary gonadotropins (eLH, eFSH, eCG) are heterodimeric glycoprotein hormones with alpha (α) and beta (β) subunits. It is responsible for maintenance of pregnancy in mares during early gestation and fairly valuable for inducing superovulation in animals other than equines. The alpha subunit is common, while beta subunit is species-specific in all glycoprotein hormones. In the present investigation, molecular cloning and in silico characterization including homology modeling and molecular docking analysis of the equine chorionic gonadotropin (eCG) alpha subunit was carried out for gaining structural and functional insights into the eCG alpha subunit and its possible interaction with ganirelix, a gonadotropin-releasing hormone (GnRH) antagonist. The equine chorionic gonadotropin (eCG) alpha subunit expressed in pituitary gland was selected, amplified from total RNA, cloned and sequenced. The in silico analyses were made for homology modelling, structural details, epitope identification and chromosomal localization. Molecular docking studies of eCG alpha were undertaken with a drug ganirelix which is used to control ovulation and has antagonistic activity against GnRH. The protein sequence corresponding to selected open reading frame (ORF) was 99-100% similar with domesticated horse, Przewalski's horse, and 92-93% with Burchell's zebra and donkey. Molecular docking studies revealed the possible interaction of eCG alpha with ganirelix. The possible drug-macromolecule interactions were visualized between eCG alpha and ganirelix. The study will provide structural insight into unique sites and an alternate route of gonadotropin suppression applicable to assisted reproductive technologies.


Background
Reproduction is a complex but extensively studied process. Akin to other mammalian species, in equines, several physiological events coupled with reproduction are extremely coordinated, synchronized and regulated by the endocrine system, i.e. a series of communications of the hypothalamic-pituitary-gonadal factors. The pituitary gland is truly dubbed as the ''master player of endocrine orchestra'' of the living mammalian body and act as the principal endocrine regulator of reproduction, growth and metabolism as well as other environmental conditions including response to stress. Distinctive cell types in the anterior pituitary gland including somatotrophs and gonadotrophs, secrete polypeptide hormones such as prolactin, growth hormone and the gonadotropins viz. luteinizing (LH) and follicle stimulating (FSH) hormones (Kalra et al. 1997;Gregory et al. 2000). Pregnant mare serum gonadotropin (PMSG), also known as equine chorionic gonadotropin (eCG) is responsible for maintenance of pregnancy in mares during early gestation (Christakos and Bahl 1979). Besides, this hormone is fairly valuable for inducing superovulation in animals other than equines. eCG is composed of two non-similar and non-covalently associated a and b subunits (Pierce and Parsons 1981).
Within the same species, the a subunit is encoded by a single gene while different genes encode b subunits providing receptor binding specificity to the glycoprotein hormone heterodimers thereby offering the biological specificity to each hormone (Combarnous 1992). However in contrast to primates and other farm animals, in horses, both LH and CG b subunits are encoded by the same gene (Sherman et al. 1992). Both the pituitary eLH and placental eCG display the dual LH and FSH like activity in nonequine species with identical FSH/LH activity ratios (Chopineau et al. 1993). Although eCG and eLH exhibit identical a and b polypeptide chains, their carbohydrate contents are different. Undeniably, among all the glycoprotein hormones, eCG is the most heavily glycosylated with 45% carbohydrate by weight versus 30% for eLH (Bousfield et al. 1996;Hokke et al. 1994). The nature of oligosaccharide side chains plays an imperative role in the processing and biological activity of the respective hormones (Ulloa-Aguirre et al. 2001). The chorionic gonadotropin, expressed in the placenta of higher primates (Maston and Ruvolo 2002), is indispensable for positive advancement of pregnancy. CG mediates its action through the LH/CG receptor, and is responsible for preventing regression of the corpus luteum. This stimulates sustained progesterone production that upholds the uterine lining in a physiological stage receptive for implantation. A 621 base pair fragment of the cDNA for the a-subunit of human chorionic gonadotropin was isolated by cloning in a plasmid vector, and the complete nucleotide sequence was determined (Fiddes and Goodman 1979). They also reported that the four human glycoprotein hormones are encoded by common alpha subunit gene (Fiddes and Goodman 1981). Min et al. (1994) gave first report on nucleotide sequence and placental expression of eCG alpha subunit. They determined the full length of the equine a subunit cDNA sequence and investigated the expression ratio of a and b subunit mRNA in the equine placenta on day 70 of gestation. Not only in equines, but other primates and non-primates pituitary glycoprotein alpha subunits were studied, compared with other reference sequences and characterized (Abel et al. 2014;Weltzien et al. 2003;Chi et al. 2015;Chaube et al. 2015). From time to time, various reports on eCG alpha subunit have been published. However, to our knowledge, this is the first report on three dimension structure prediction of eCG alpha protein, its antigenic propensity score and its interaction with potential drug molecule which may modulate the in vivo activity of the protein. Ganirelix is a synthetic decapeptide with high antagonistic activity against naturally occurring gonadotropin-releasing hormone (GnRH). It is mainly used in assisted reproductive technology to control ovulation. Ganirelix prevents ovulation until it is triggered by injecting human chorionic gonadotropin (hCG). Ganirelix is also known to interact with hCG. Treatment with ganirelix effectively prevents premature LH rises (Lambalk et al. 2006). Therefore, it was interesting to assess the possible drug-macromolecule interactions between ganirelix and eCG alpha. The present study will pave ways and provide insight into unique sites in alpha subunit of pituitary gonadotropin that will be useful in future studies In Silico Pharmacol.
based on equine gonadotropins as well as in assisted reproductive technologies.

Animals, sampling and ethics statement
Pituitary glands were collected after natural death of mares at the organized equine farm of NRCE, Bikaner. All the procedures for sample collection were performed according to the approval by the Institute Animal Ethics Committee (IAEC) of ICAR-National Research Centre on Equines vide NRCE/CPCSEA/2012-13/dated the 18th July 2012.

RNA isolation and cDNA synthesis
RNA extraction was done with RNAqueous Kit (Ambion Inc., Austin, Texas) as per manufacturer's instructions. Total RNA was quantified and absorbance was measured at 260 and 280 nm with a UV-spectrophotometer (Nanodrop ND-1000, USA). The integrity of RNA preparation was evaluated using agarose (1%) gel electrophoresis (Tarson, India) after its denaturation with glyoxal and dimethyl sulfoxide (DMSO) (McMaster and Carmichael 1977) in 1xTAE buffer at 60 V for 30 min (EPS-Pharmacia-500/400). The High Capacity cDNA Reverse Transcription Kit (Applied Biosystems) was used for cDNA synthesis following the manufacturer's instructions.

PCR amplification of alpha subunit of equine gonadotropin
For amplification of a subunit of eCG, reported primers were selected (Galet et al. 2000) as forward-5 0 -CGGAATTCGCTCCTGATGTGCAGGATTGCC-3 0 (30 bp) and reverse-5 0 GCTCTAGACCTTTAAATCT TGTGGTGAT-3 0 (28 bp). The PCR conditions were: initial denaturation at 94°C for 4 min, cycling denaturation at 94°C for 30 s, annealing at 57°C for 45 s, extension at 72°C for 45 s, followed by 35 cycles with a final extension step at 72°C for 10 min. The PCR product (*300 bp) was checked in agarose gel electrophoresis.

Cloning of alpha eCG gene in TOPO-TA cloning vector
The TOPO-TA vector (Invitrogen, CA) was used for cloning the PCR product according to the manufacturer's instructions. The PCR product to be ligated was purified using Qiagen gel purification kit. For ligation of a-eCG gene into the TOPO-TA vector, the concentration of both the product and the vector was kept in the molar Ratio of 4:1. The well isolated blue and white colonies were observed. The white colonies were picked from the master plate and sub-cultured on LB agar plates with antibiotics such as ampicillin (100 lg/ml) and tetracycline (50 lg/ml) as well as X-gal (40 ll of 20 mg/ml concentration) and IPTG. The colonies, which were able to grow on selective antibiotics, were considered as transformed. Recombinant plasmids were isolated from E. coli DH-5 alpha cells (harbouring the plasmid a-eCG-TOPO-TA) by alkaline lysis method. For amplification of a-eCG gene colony PCR was carried out. The PCR products were analyzed in agarose gel electrophoresis along with DNA molecular weight marker.
Custom sequencing of cloned alpha-eCG gene and analysis The purified PCR product and recombinant plasmid with cloned eCG alpha gene was sent for custom sequencing to Xcleris genomics lab, Ahmadabad, India. The sequence obtained was analysed with CLC sequence viewer v 7.6.1.and similarity search was done with the reported sequences retrieved from the NCBI.
In silico analysis of alpha-eCG

Sequence retrieval and analysis
The analysis of the translated amino acid sequences of the ORF corresponding to the cDNA coding for alpha eCG (partial) was done by ORF Finder, coding potential calculator (CPC), NCBI BLAST as well as other bioinformatics tools such as CLC sequence viewer, DNASTAR Lasergene software suite version 7.1 and PSIPRED bioinformatics softwares and other tools described in the following section.  (NP_001277528), pig (NP_999611.1), mice (NP_034019.1), brushtail possum (AAC63900.1), African elephant (XP_003404412), galago (XP_003803482), tarsier (XP_008066056), marmoset (NP_001254689.1), Arabian camel (XP_010986025.1), sheep (NP_001009464.1), cattle (NP_776326.1) and goat (NP_001272613.1). Sequences were aligned with the help of CLC Sequence Viewer v7.7 (QIAGEN Aarhus A/S) and the sequence alignment was saved. The sequence alignment file was again used in the MEGA 7.0.14 program. Molecular phylogenetic analysis was performed from this data following the Maximum-Likelihood (ML) method for 1000 bootstrap replications. The evolutionary history was inferred by using the Maximum Likelihood method based on the Jones-Taylor-Thornton (JTT) matrix-based model (Kumar et al. 1994). The tree with the highest log likelihood (-530.7622) is shown. The percentage of trees in which the associated taxa clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Joining and BioNJ algorithms to a matrix of pairwise distances estimated using a JTT model, and then selecting the topology with superior log likelihood value. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. The analysis involved 27 amino acid sequences. All positions containing gaps and missing data were eliminated. There were a total of 78 positions in the final dataset. Evolutionary analyses were conducted in MEGA7 (Caspermeyer 2016). To understand the genome organization of the eCG alpha gene, the chromosomal location was also determined based on the Equus caballus genomic sequence annotation from NCBI (Equus caballus isolate Twilight breed thoroughbred chromosome 10, EquCab2.0; NC_009153).

3D modeling
3D models were predicted by using Geno3D (automatic molecular modeling tool) server of PBIL-IBCP Lyon-Gerland (https://geno3d-prabi.ibcp.fr/) (Combet et al. 2002). The model generated was evaluated in terms of Ramachandran plot, main and side chain parameters and residue properties by PROCHECK (Laskowski et al. 1993), Z-score and QMEAN score. The 3D homology model was checked by Procheck and assessed by web servers such as ProSA-Web (Wiederstein and Sippl 2007) and QMEAN (Benkert et al. 2009) for model quality estimation. To assess the quality of model, QMEAN Z-score overall quality factor was used and QMEAN score for density plot was used to assess the organization of various atoms. SWISS-MODEL was also used to assess the quality and structural features of the developed model and local model quality estimation was done by ANOLEA (atomic nonlocal environment assessment) (Melo et al. 1997), GRO-MOS (Christen et al. 2005) and QMEAN6.
Evaluation of the structure as well as other stereochemical analyses were accomplished by online available Geno3D (Combet et al. 2002) and VADAR version 1.8 (http://vadar.wishartlab.com/) (Willard et al. 2003) servers. The model was visualized by using 3-D Molecule Viewer of Vector NTI Advance, Version 11.5.2 (Life Technologies) and PyMol molecular visualization software (Schrodinger 2010). The probable binding sites on the developed eCG alpha model are very important and therefore, the active Sites of the protein were analysed by MetaPocket 2.0 (http://projects.biotec.tu-dresden.de/metapocket/index. php) (Huang 2009).

Virtual screening and molecular docking
Molecular docking was executed to reveal the interactions of ganirelix towards its target protein namely the eCG alpha. PyRx-Virtual screening tool version 0.8 (pyrx.sourceforge.net, Scripps Institute) (Dallakyan and Olson 2015) was used for virtual molecular screening and docking with the lead compound ganirelix (Additional file 2: Table S1). As mentioned earlier, it is a synthetic decapeptide with high antagonistic activity against naturally occurring gonadotropin-releasing hormone (GnRH). The PDB file of the eCG alpha 3D homology model earlier developed, was used as the macromolecule. The 2D structure file was downloaded from the Pubchem open chemistry database available at NCBI in sdf/xml format. OpenBabelGUI (O'Boyle et al. 2011) was used to create all the input files in.pdbqt format and saved. Before docking, PDB file of eCG alpha 3D model was energy-minimized and prepared by removal of water molecules, addition of polar hydrogens and Gasteiger charges. To conclude, corrected PDB files were saved in.pdbqt format. AutoDock wizard was used and the ligand and macromolecule were selected. A grid box with a size of 63.93 Å (X), 42.69 Å (Y) and 25.0 Å (Z) was used. The centre of the grid box was allocated using x, y, z coordinates of -2.72236290175, 1.41345873493 and 0.6341, respectively. Molecular docking was performed using AutoDock Vina, which is a part of the PyRx 0.8 software. Docking poses of ganirelix-eCG alpha docked structures were visualized using the PyMOL v 1.8.2.0 software 32 and the In Silico Pharmacol.

Structure-based prediction of antibody epitopes
Though, the reliable prediction of epitopes is very exciting and challenging, it provides great opportunity for the design and fabrication of immunodiagnostics. Therefore, Linear and discontinuous B cell epitopes were predicted using the ElliPro: Antibody Epitope Prediction server of the IEDB analysis resource (Ponomarenko et al. 2008). The SCRATCH Protein Predictor (http://scratch.proteomics.ics. uci.edu/) was used for ascertaining the antigenicity propensity score (Cheng et al. 2005). The eCG alpha protein sequence was also analysed from a diverse selection of methods for secondary structure, hydropoathy, antigenicity, amphilicity, surface probability and flexibility by using the Protean Software (for Protein Structure Analysis and Prediction) of the DNASTAR Lasergene software suite version 7.1 (Burland 2000).

Amplification and cloning of alpha subunit of eCG
The mare's pituitary was successfully processed for total RNA isolation and cDNA synthesis. The purity of RNA preparation was determined by finding the A 260 /A 280 ratio which was found to be in the range of 1.8-2.0. The cDNA was synthesized and used for amplification of alpha-eCG of approximately 300 bp by using reported primers corresponding to equine chorionic gonadotropin hormone alpha subunit (Fig. 1a).
The sequence alignment, phylogenetic and chromosomal localization analyses of alpha subunit of eCG The sequencing of the PCR product and recombinant plasmid with cloned eCG alpha gene revealed a nucleotide sequence of 306 bp including the sequences corresponding to the primers as well as those of sites for restriction enzymes (EcoRI and XbaI) (Additional file 1: Figure S1). The ORF Finder (Open Reading Frame Finder) of NCBI was used to find the ORF corresponding to the nucleotide sequence. Three ORFs were obtained with 79, 67 and 48 amino acids excluding the primer sequences. These ORFs were subjected to protein blast at NCBI and preferred based on the maximum sequence identity. Thereafter, the corresponding amino acid sequence (79 amino acid) of alpha subunit of eCG was selected (Fig. 1b) and subjected for further in silico analysis.
The nucleotide sequences corresponding to the selected ORF showed sequence similarity of 99, 97 and 96 percent with the reported glycoprotein hormone or chorionic gonadotropin alpha sequences of Equus caballus (horse) as well as Equus przewalskii (Przewalski's Horse or Dzungarian Horse), Equus burchelli (Plains Zebra) and Equus asinus (Donkey) respectively. The translated ORF sequence was: MTSSFFKLGVPIYQCKGCCFSRA YPTPARSRKTMLVPKNITSESTCCVAKAFIRVTVM GNIKLENHTQCYCSTCYHHKI*.
Putative conserved domains characteristics of glycoprotein homone, i.e. Hormone_6 super family (actual alignment was observed with pfam00236 member of the superfamily cl02448) were detected in the NCBI Conserved Domain search. In Pfam searches also significant matches were observed with Hormone_6 family (PF00236) and CL0079 clan. The sequence obtained was compared to the reference sequences of other species available at Gen-eBank. The high similarity was observed between alpha eCG and other reported sequences (Additional file 1: Figure S2).
The alpha-subunit of eCG was found to be highly conserved among equines and other species as suggested by earlier reports (Cahoreau et al. 2015). The alpha-subunit of eCG sequence was also analyzed for putative ORF sequences and detailed RE map was generated for observing putative sites for restriction enzymes which may interfere while cloning the gene in vector. The alpha subunit of eCG is formed of 24 amino acid signal sequence and a mature protein of 96 amino acids. In the present study only the mature polypeptide was selected for amplification and based on the primers used, the amplification product coding for an ORF of 79 amino acids was used and was devoid of the signal peptide. However, further the amino acid sequences, MTSS were also left based on the conservation among sequences in various species.
As shown in the Additional file 1: Figure S2, the alpha subunit sequence was found to be about 99-100% similar with the gonadotropin alpha subunit sequences or precursors of modern domesticated horse and the endangered Przewalski's horse, and 92-93% with Burchell's zebra and the donkey. Similarly, a sequence identity of 84% (cat and lesser hedgehog tenrec), 82% (bat, Coquerel's sifaka lemur, rabbit, walrus, polar bear and dog), 81% (amur tiger, pig, mice, brushtail possum and African elephant), 80% (galago and tarsier), 79% (marmoset), 78% (Arabian camel, sheep and cattle) and 77% (goat) respectively, was observed with the reported gonadotropin alpha and precursor sequences. To visualize the phylogenetic relationship of the deduced ORF amino acid sequences of alpha subunit of eCG gene with those of other sequences known in other mammalian species, we analyzed it with other 26 sequences mentioned earlier, i.e. 5 of equid family c Phylogenetic relationship of the equine chorionic gonadotropin alpha (eCG alpha) protein sequence with other 26 reported protein sequences of different mammals including other equines. The data provided by the multiple sequence alignment patterns of amino acids were used for the molecular phylogenetic analysis. In the phylogenetic tree, the sources of all the sequence data are provided as accession numbers and species names. The phylogeny is an un-rooted tree recovered using the maximum-likelihood (ML) method based on 1000 bootstrap replications and Jones-Taylor-Thornton (JTT) matrix-based model by the MEGA 7.0.14 program. The tree with the highest log likelihood (-530.7622) is shown here. Bootstrap values greater than [0% is shown on the branch In Silico Pharmacol. members and those of cat, lesser hedgehog tenrec, bat, Sifaka lemur, rabbit, walrus, polar bear, dog, tiger, pig, mouse, brushtail possum, marmoset, camel, elephant, galago, tarsier, sheep, cattle and goat. Based on multiple sequence alignment of the alpha subunit of eCG with other sequences of these selected species, the phylogenetic tree was constructed. The result is shown in Fig. 1c.
Equus caballus genomic sequence annotation from NCBI (Equus caballus isolate Twilight breed thoroughbred chromosome 10, EquCab2.0; NC_009153) was used for determining the genome organization of the eCG alpha gene and the chromosomal location was determined. As shown in the Additional file 1: Figure S3, the eCG alpha genes were found to be distributed on chromosome 10 at 2 places based on the matches (aligned regions spanning positions from 39938737-39938896 and 39937910-39937996). Similar results were evident while using Ensemble BLAST. Searches for these regions in Ensemble displayed the CGA gene (ENSECAG00000018841) for eCG alpha at chromosome 10 and a synteny between horse chromosome 10 (position 39937900-39940063) and Human chromosome 6 (position 87085498-87095406) and dog chromosome 12 (position 46610973-46613039) (Additional file 1: Figure S4).

Generation, validation and visualization of the 3D model of the eCG alpha
Geno3D (Automatic molecular modeling tool) of PBIL-IBCP Lyon-Gerland was used for construction of the homology model of eCG alpha subunit. NCBI BLAST search of the sequence of its translated protein sequence chain was also performed against the PDB database to find a suitable template. The sequence chain to be modelled was selected based on similarity search as ''SSFFKLGVPIYQCKGCCFSRAYPTPARSRKTMLVPK NITSESTCCVAKAFIRVTVMGNIKLENHTQCYCSTCY HHK'' and modelled using the four selected templates (pdb1dz7A_0, pdb1e9jA_0, pdb1hd4A_0 and pdb1xwdA-0) with an identity of 73.3 percent. The mean deviation between the four templates on this chain was 6.123312 Å . A Ramachandran plot of the predicted eCG alpha model (Additional file 1: Figure S5) indicated the region of possible angle formations by u (phi) and w (psi) angles. Ramachandran plot of the eCG alpha 3D model showed 95.4% residues in favoured and allowed regions, whereas 3 residues were in disallowed regions.
The ProSA was used to check the 3D homology model of eCG alpha for potential errors. As illustrated in the Additional file 1: Figure S6, the ProSA presented its two characteristics, i.e. the Z-score and a plot of its residue scores which shows local model quality by plotting energies as a function of amino acid sequence position. The ProSA Z-Score of -2.29 (highlighted as black dot) and the better residue scores plot were indicative of the modal quality. ProSA-web z-score plot showed that the z-score of the model was within the range of scores for similar protein structures evident by NMR spectroscopy and X-Ray crystallography. ProSAweb plot of residue scores of the modelled structure showed the local model quality by plotting energies as a function of amino acid sequence position. No positive values corresponding to problematic parts of the input structure in window size 40 which indicates average energy over each 40 residue fragment (thick line). The second line with a smaller window size of 10 residues as background (thin line) also exhibited a better local model quality.
The eCG alpha model was also assessed for the quality and structural features by SWISS-MODEL (Additional file 1: Figure S7). The ANOLEA (atomic non local environment assessment) is used to assess packing quality of the models. The program performs energy calculations on a protein chain, evaluating the ''Non-Local Environment'' (NLE) of each heavy atom in the molecule. QMEAN is a composite scoring function for both the estimation of the global quality of the entire model as well as for the local per-residue analysis of different regions within a model. GROMOS is a general-purpose molecular dynamics computer simulation package for the study of biomolecular systems and can be applied to the analysis of conformations obtained by experiment or by computer simulation. The y-axis of the plot represents the energy for each amino acid of the protein chain. Negative energy values (in green) represent favourable energy environment whereas positive values (in red) unfavourable energy environment for a given amino acid.
The QMEAN score of the eCG alpha homology model was matched to scores of X-ray crystallography resolved reference structures. A highly resolved structure has the zero average Z-score. A total QMEAN Score of 0.438 (Zscore: -2.00) for estimated absolute quality was found for the eCG alpha indicating a fairly good model was also evident (Additional file 1: Figure S8). The QMEAN score along with residue error plot data estimated per-residue inaccuracies along the sequence; support the validity of the predicted model.
The homology model was thereafter visualized by 3-D Molecule Viewer (a component of Vector NTI Advance 11.5.2). The 3D Homology Model (Fig. 2b) of the eCG alpha subunit illustrates the location of the three alpha subunit loops: Loop 1, Loop 2 and Loop 3. This structure also illustrates the location of the region with sequences ''CCFSRA'' that probably may act as a binding antagonist. The eCG alpha contains the two conserved motifs with ''CKGCCFSRAYPTP' ' &''NHTQCYCSTCYHHK'' In Silico Pharmacol. sequences respectively. 3D Mol tool was used to calculate and display molecular surface dimensions via the Connolly method.
The binding site analysis of eCG alpha was done by MetaPocket 2.0. The MetaPocket uses the combined analyses by eight methods viz. LIGSITE, PASS, Q-SiteFinder, SURFNET, Fpocket, GHECOM, ConCavity and POCASA and have revealed the predicted binding sites in the eCG alpha (Additional file 1: Figure S9).

Virtual screening and molecular docking
To investigate the molecular binding interactions of ganirelix (Fig. 2a) and eCG alpha molecules and to elucidate the possible molecular mechanism, PyRx Autodock 4 docking analysis was applied. About nine different poses or binding modes were observed. The molecular binding affinity of these structures ranged from -5.0 to -4.5 kcal/mol. The average molecular binding affinity (docking energy) scores of the ganirelix on the eCG alpha were -4.733 kcal/mol (SEM = ±3.99). When all these docked structures were analysed using PyMOL (Fig. 2c and Additional file 1: Figure S10), Dassault Systèmes BIOVIA Discovery Studio Visualizer (v16.1.0.15350) and LIGPLOT ? (Wallace et al. 1995), it was observed that some residues located in the different subsites of eCG alpha are involved in making hydrophobic stacking interactions.
In silico analysis of eCG alpha for the identification of potential B cell epitope The eCG alpha protein was subjected to bioinformatics analyses including hydrophilicity/hydrophobicity, identification of potential B cell epitope or antigenicity/antigenicity propensity score. Using ElliPro (Ponomarenko et al. 2008), 4 structural templates with PDB IDs viz. 1HCN, 1HRP, 4AY9 and 4MQW were used for the homology modelling and identification of epitopes. A 19 amino acid peptide sequence (KTMLVPKNITSESTCCVAK) and a thirteen amino acid peptide sequence (GNIKLENHTQCYC) seems promising linear epitopes (Fig. 4a). Four clusters were predicted to form conformational B cell epitope or discontinuous epitope. Out of these four, most promising residues clusters are shown in the Additional file 2: Table S2. 2D Score Chart or protrusion index chart for the eCG alpha protein sequence is exhibited in the Fig. 4b. The SCRATCH Protein Predictor (accessible online through http://scratch. proteomics.ics.uci.edu/) (Cheng et al. 2005) was also used for ascertaining the antigenicity propensity score. We also analyzed the eCG alpha protein sequence using various methods for helical wheel diagram presenting the distribution of amino acids spatially in a 2D graph (Additional file 1: Figure S11), secondary structure, hydropoathy, antigenicity, amphilicity, surface probability and flexibility (Fig. 4c) by using the Protean Software (for Protein Structure Analysis and Prediction) of the DNASTAR Lasergene software suite version 7.1.  Predicted discontinuous epitope region with residue cluster (L63, E64, N65, H66, T67, Q68, C69, Y70, C71) is highlighted as trace and yellow colour. b 2D Score Chart or protrusion index chart for the eCG alpha protein sequence. The prominent epitope regions with amino acid sequences ''KTMLVPKNITSESTCCVAK'' and ''GNIK-LENHTQCYC'' respectively are shown as yellow shaded regions and circled in yellow. c eCG alpha protein structure analysis and prediction by Protean software. The analyses were made for the secondary structure, hydropathy, antigenicity, amphiphilicity, surface probability and flexibility In Silico Pharmacol.

Discussion
To strengthen the research primarily targeting the gonadotropin therapy for fertility augmentation and in assisted reproductive technologies, we hypothesize that the alpha subunit of equine chorionic gonadotropin hormone plays an imperative role. Many workers across the globe reported the studies on alpha-subunit of pituitary glycoprotein hormones. Chopineau et al. (2004), reported the identification of aminoacids in the alpha-subunit first and third loops, that are crucial for the hetero-specific follicle-stimulating hormone activity of equid luteinizing hormone/gonadotropin. It was precisely shown that which amino-acids of the alpha-subunit are implicated in the interaction with gonadotropin receptors. Some earlier studies based on photo affinity labelling (Hong et al. 1999;Sohn et al. 2002) immunological studies (Moyle et al. 1982) and mutational studies (Yoo et al. 1991;Zeng et al. 1995;Arnold et al. 1998) demonstrated that the alpha subunit also interacts with the receptor. However, we report for the first time, the predicted three dimensional structures of the mature alpha subunit of eCG and its molecular docking with ganirelix which may play role in elaborating the responsibility played by alpha subunit of eCG in the process of reproduction and maintenance of pregnancy in equines as well as other species.
In many of the preliminary studies while evaluating the application of the GnRH antagonists to clinical practice, GnRH antagonist's capability to block the LH surge in a reproductive cycle was also witnessed. The obstruction of the pituitary GnRH receptors caused the hindrance of the LH surge as well as the interruption of FSH production and further arrest of follicular development (Ditkoff et al. 1991;Kettel et al. 1991). It also became evident that on application of GnRH antagonists to natural cycle IVF, addition of exogenous gonadotropins was needed to maintain follicular growth. It is worthwhile to mention that GnRH and eCG have been used in timed artificial insemination (TAI) protocols to improve the pregnancy rate (PR) in cattle (Gaievski et al. 2015). Therefore, it was interesting to investigate the role of GnRH antagonist such as ganirelix here. Ganirelix/Ganirelix acetate (or diacetate) (Orgalutran Ò or Antagon TM ) is a synthetic decapeptide and derived from GnRH with amino acid substitutions at positions 1, 2, 3, 6, 8, and 10. Deemed as an injectable competitive gonadotropin-releasing hormone antagonist (GnRH antagonist), it is primarily used in assisted reproduction to control ovulation. Ganirelix acetate (Antagon TM ) was launched in Europe in 2001 by Organon for the treatment of fertility disorder. In 1999 Organon received US FDA approval for the drug for inhibiting premature LH surges in women undergoing multifollicular ovarian stimulation (Papanikolaou et al. 2005). Ganirelix has been used as a pharmacological tool to prevent ovulation, until it is prompted by injecting human chorionic gonadotrophin (hCG) with which it is known to interact. It has been the basis of treatment regimen for IVF patients who are undergoing ovarian stimulation and might be restricted to days spanning LH premature surge (Borm and Mannaerts 2000). All the earlier works have reiterated that use of GnRH antagonists such as ganirelix suppress the gonadotropins by blocking the GnRH receptor.
Based in the in silico computational methods, including molecular docking, we pursued to gain an insight into the structural mechanism of eCG recognition by ganirelix. Employing the present molecular docking study, it was evident that ganirelix can also interact with eCG alpha (Figs. 2, 3) and the interacting numbers of amino acids and conventional hydrogen bonds and hydrophobic interactions might be critical factors for regulating target protein activity. Docking with ganirelix was predictive of a possible preference for the eCG alpha subunit which is common to gonadotropins. We observed nine different poses or binding modes for the docked structure. The molecular binding affinity of these structures ranged from -5.0 to -4.5 kcal/mol. The average molecular binding affinity (docking energy) scores of the ganirelix on the eCG alpha were -4.733 kcal/mol (SEM = ± 3.99). These were indicative of the fact that the binding free energies and affinity can fluctuate in silico, signifying that ganirelix may access or fit the eCG alpha through several modes. The docking study has revealed the stabilization of eCG-ganirelix complex through several hydrophobic interactions and hydrogen bonds. These data also suggest that computer aided drug design process using PyRx, and Discovery Studio tools is highly reliable and can be a good example for in identifying the interaction between eCG alpha and ganirelix. Previous studies have shown that ganirelix suppress gonadotropins by blocking GnRH receptor and mentioned earlier in the manuscript. Also, the alpha subunit is common to all the gonadotropins. Complementary to these facts, our data of molecular docking study indicate for an alternate route of gonadotropin (in our case, that of eCG alpha) sequestration by interactions of alpha subunit with the ganirelix. This may elucidate an alternative possibility for ganirelix mediated gonadotropin suppression directly by binding apart from GnRH receptor blocking. This hypothesis may be subject to further studies. In our study, we also report the putative antigenic sites for alpha-eCG to act as immunogen or as an analog to generate antibodies required in various studies related to reproduction in farm animals. Putative chromosomal localization of the eCG alpha was also determined through in silico ways.

Conclusions
In summary, the equid alpha subunit is an absolute requirement for expression of both the LH and FSH activities of equine chorionic gonadotropin hormone (Chopineau et al. 1997) and through various in silico analysis of eCG alpha sequence, we attempted for gaining structural insights as well as providing an alternate route of gonadotropin suppression. It has to be noted here that our results are largely based on the in silico analysis and data mining, and have not been proven experimentally. However, they still provide an alternate explanation for the gonadotropin suppression or sequestration through gonadotropin alpha subunit and ganirelix interactions. Future studies will be helpful in elucidation of the eCG alpha and ganirelix interactions.