Maximum‐likelihood approaches reveal signatures of positive selection in BMP15 and GDF9 genes modulating ovarian function in mammalian female fertility

Abstract Bone morphogenetic proteins (BMPs) and growth differentiation factors (GDFs) play an important role in ovarian folliculogenesis and are essential regulators of numerous granulosa cell processes. Variations in the BMP15 gene are linked to ovarian phenotypes that differ by species: from infertility to improved prolificacy in sheep, primary ovarian insufficiency in women, and minor subfertility in mice. To study the evolving role of BMP15 and GDF9, a phylogenetic analysis was performed. To identify candidate genes associated with prolificacy in mammals, the nucleotide sequences of the BMP15 and GDF9 genes were tested for positive selection in various mammalian species. Maximum‐likelihood approaches applied to BMP15 and GDF9 showed robust divergence and rapid evolution compared with other TGFβ family members. Furthermore, among 32 mammalian species, we identified positive selection signals in the Hominidae clade at codon sites 132D, 147E, 163Y, 191W, and 236P of BMP15 and 162F, 188K, 206R, 240A, 244L, 246H, 248S, 251D, 253L, 254F, and other codon sites of GDF9. Positively selected amino acids such as alanine, leucine, arginine, and lysine are important for signaling. In conclusion, this study provides evidence that the GDF9 and BMP15 genes have evolved more rapidly than other TGFβ family members and were subjected to positive selection in the mammalian clade. The sites under positive selection are of considerable significance for the specific functioning of the protein and consequently for female fertility.

oogenesis in oocyte preparation to support embryonic development (Sánchez & Smitz, 2012). However, oocyte development within the follicular structure involves a continuous two-way dialog between the oocyte and the cumulus complex, as well as the other somatic cells in the follicle, such as the granulosa and theca cells. The granulosa cells are important components of the follicular environment for the achievement of oocyte competence, ovulation, and fertilization, as they regulate the expression of luteinizing hormone receptor (LHR), the production of estradiol and progesterone, the secretion of inhibin A and B, and the production of several transcripts of vital proteins (Ceko et al., 2014; Hatzirodos et al., 2014). Bone morphogenetic protein 15 (BMP15) and growth differentiation factor 9 (GDF9) belong to the TGFβ superfamily and act on the granulosa cells to regulate oocyte growth and differentiation. They are expressed in all phases of follicle development in mammalian species and are involved in the steroidogenic regulation of granulosa cells (Dias, Khan, Adams, Sirard, & Singh, 2014; Peng et al., 2013). A recent phylogenetic analysis revealed that the GDF9 and BMP15 genes diverged rapidly and evolved fast compared with other BMPs. However, only BMP15 was subjected to positive selection in the mammalian clade (Auclair et al., 2013). In a search for candidate genes associated with prolificacy in goats, most of the nucleotide sequence of several genes, including GDF9 and BMP15, was characterized in various goat breeds for possible association with high fertility (He, Ma, Liu, Zhang, & Li, 2010). Because BMP15 and GDF9 play an important role in fertility and prolificacy, attention should be given to the gene sequences (using both wet and dry lab methods) that are generally responsible for the observed phenotypic variation. A good understanding of these gene sequences will help in recognizing the modifications responsible for the various effects ascribed to each gene.
The objective of this study was to explore selection signatures using maximum-likelihood approaches on the basis of molecular genetic differences in BMP15 and GDF9 among mammalian species, with a view to providing applicable genetic information for marker-assisted selection in the different species.

| Sequence analysis and data set preparation
The coding nucleotide and amino acid sequences of the BMP15 and GDF9 genes used in this analysis were retrieved from GenBank (www.ncbi.nlm.nih.gov/genbank), Ensembl (http://useast.ensembl.org/index.html), and UniProt (http://www.uniprot.org). The retrieved sequences were aligned using ClustalOmega, executed in the MEGA 6.0 program (Tamura, Stecher, Peterson, Filipski, & Kumar, 2013), followed by manual adjustment. The phylogenetic trees of the BMP15 and GDF9 genes were generated with MEGA 6.0 based on the maximum-likelihood method. The clustering of taxa was assessed with a bootstrap test of 1,000 replicates, the topology with the higher log likelihood value was selected, and branch lengths were measured in the number of substitutions per site (Asif, Awais, Qadri, Ahmad, & Du, 2017). The accession numbers and identification of species used for BMP15 and GDF9 are listed in Table S1.
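The bootstrap test mentioned above rests on resampling alignment columns with replacement. The following is a minimal sketch of that resampling step only, not of the MEGA implementation: the function name `bootstrap_alignment`, the toy sequences, and the random seed are all illustrative assumptions, and the tree inference that would follow each replicate is omitted.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate: columns resampled with replacement."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    # Draw n_cols column indices with replacement; the same indices are
    # applied to every sequence so that columns stay intact.
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in cols)
            for name, seq in alignment.items()}

# Toy, made-up alignment (not data from this study).
toy = {"human": "ATGGCC", "sheep": "ATGGCT", "mouse": "ATGACT"}
rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
```

In a full analysis, a tree would be inferred from each replicate, and the support for a clade would be the fraction of replicate trees that contain it.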

| Codon-based positive selection analysis
To identify particular codons under positive selection in mammalian BMP15 and GDF9 sequences, the ω ratios (dN/dS) were compared using two maximum-likelihood approaches: the HyPhy package implemented in the DATAMONKEY Web Server (http://www.datamonkey.org/) (Poon, Frost, & Pond, 2009) and CODEML implemented in PAML version 4 (Yang, 2007). Only results in which ω ratios were significantly higher than 1 were considered in the analysis.
The analysis involves two main steps. In the first step, we used the maximum-likelihood ratio test to detect positive selection, that is, the presence of sites with ω > 1. We achieved this by comparing a null model that does not allow for sites with ω > 1 with a more general model that does. The second main step is to identify the amino acids subjected to positive selection once their presence has been confirmed by the likelihood ratio test.
These sites are inferred by using Bayes' theorem to estimate the posterior probabilities for each site across the different ω classes (Bielawski & Yang, 2003). Amino acid residues with a high posterior probability of having ω > 1 are likely to be under positive selection. Amino acid positions subjected to positive selection were mapped onto the crystal structure using the Phyre (http://www.sbg.bio.ic.ac.uk/phyre2/html) and Swiss model (http://swissmodel.expasy.org) online programs (Kelley & Sternberg, 2009). The level of evolutionary conservation of amino acid/nucleic acid positions in the protein was predicted using the ConSurf server (http://consurftest.tau.ac.il), based on the phylogenetic relationship between sequences (Glaser et al., 2003). To further confirm the codon sites under selection pressure, the aligned codon sequences of BMP15 and GDF9 were tested in Selecton, version 2.2 (http://selecton.tau.ac.il/), which allows the ω ratio to vary between codons within the aligned sequence; this was assessed by a maximum-likelihood test with a Bayesian inference method (Yang, Liao, Zhuang, & Zhang, 2012). The Selecton results are shown with color scales indicating the various types of selection.
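The two steps described above can be sketched numerically. This is a simplified illustration under stated assumptions, not the CODEML implementation: the log-likelihood values, class priors, and per-class site likelihoods are made-up numbers, and the function names are hypothetical. The posterior calculation is the plain Bayes' theorem form with fixed parameters; it does not integrate over parameter uncertainty.

```python
import math

def lrt_pvalue_df2(lnl_null, lnl_alt):
    """Likelihood ratio test with 2 degrees of freedom.

    Returns (2*delta_lnL, p-value). For 2 df the chi-square survival
    function has the closed form exp(-x/2), so no extra library is needed.
    """
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    return stat, math.exp(-stat / 2.0)

def site_posteriors(priors, site_likelihoods):
    """Bayes' theorem for one site: P(class | data) ~ prior * P(data | class)."""
    joint = [p * l for p, l in zip(priors, site_likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]

# Step 1: compare a null model with a model that allows sites with omega > 1.
# Both log-likelihoods are made-up values for illustration.
stat, p_value = lrt_pvalue_df2(lnl_null=-5000.0, lnl_alt=-4993.0)

# Step 2: posterior probability that one site belongs to the omega > 1 class.
# Two classes are assumed here: [omega <= 1, omega > 1].
priors = [0.9, 0.1]            # assumed class proportions
likelihoods = [0.002, 0.05]    # assumed likelihood of the site data per class
posterior = site_posteriors(priors, likelihoods)

print(f"2*delta_lnL = {stat:.2f}, p = {p_value:.6f}")
print(f"P(omega > 1 | site data) = {posterior[1]:.3f}")
```

With 2 df, the closed form reproduces the usual critical values: a statistic of 5.99 corresponds to p ≈ 0.05 and 9.21 to p ≈ 0.01.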

| Protein-protein interaction network analysis
To further explore the molecular mechanisms of BMP15 and GDF9 function, we identified the key genes interacting with BMP15 and GDF9 by protein-protein interaction (PPI) analysis using STRING (version 9.1, http://www.string-db.org/) (Franceschini et al., 2012), a web server and biological database comprising known and predicted interaction data. The interactions of the proteins encoded by BMP15 and GDF9 were searched. A combined score of 0.4 was used as the cutoff criterion. This open-access bioinformatics database comprises interactions of proteins involved in various pathways. Hub nodes, that is, highly connected proteins with essential biological functions, were identified by estimating the betweenness value and the number of connections (degree) of each node. The network was constructed using STRING and visualized with Cytoscape software (http://www.cytoscape.org/) (Li, Zhao, Wang, Zong, & Yang, 2017).
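The hub-node criteria described above (score cutoff, degree, and betweenness) can be illustrated on a toy network. This is a sketch under stated assumptions rather than the STRING or Cytoscape procedure: the partner names `PARTNER_A` to `PARTNER_C` and all scores are invented, interactions are kept when the score is at or above the cutoff (an assumed direction), and betweenness is computed with Brandes' algorithm for unweighted graphs without normalization.

```python
from collections import defaultdict, deque

def build_graph(edges, cutoff):
    """Keep interactions whose combined score is >= cutoff (assumed rule)."""
    graph = defaultdict(set)
    for a, b, score in edges:
        if score >= cutoff:
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)

def betweenness(graph):
    """Unnormalized betweenness centrality (Brandes) for an undirected graph."""
    bc = dict.fromkeys(graph, 0.0)
    for s in graph:
        order = []                          # nodes in order of BFS visit
        preds = {v: [] for v in graph}      # shortest-path predecessors
        sigma = dict.fromkeys(graph, 0)     # number of shortest paths from s
        dist = dict.fromkeys(graph, -1)
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        for w in reversed(order):           # accumulate from the farthest node
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2.0 for v, b in bc.items()}

# Made-up interactions and scores, for illustration only.
edges = [
    ("BMP15", "GDF9", 0.95),
    ("BMP15", "PARTNER_A", 0.90),
    ("GDF9", "PARTNER_A", 0.85),
    ("PARTNER_A", "PARTNER_B", 0.80),
    ("GDF9", "PARTNER_C", 0.20),   # below the cutoff, so it is dropped
]
graph = build_graph(edges, cutoff=0.4)
degree = {node: len(neighbors) for node, neighbors in graph.items()}
bc = betweenness(graph)
```

In this toy network `PARTNER_A` has the highest degree and the only nonzero betweenness, so it would be the hub under both criteria.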

| RESULTS
The average ω ratio (dN/dS) across sites and lineages is <1 for BMP15 and GDF9 (Table 1). However, proteins subjected to positive selection may still contain conserved amino acids that are exposed to purifying selection and have ω less than one. The level of evolutionary conservation of amino acid/nucleic acid positions in the protein was predicted using the ConSurf server (http://consurftest.tau.ac.il), based on the phylogenetic relationship between sequences, and Selecton version 2.2, which implements the mechanistic empirical combination (MEC) model for estimating adaptive selection pressure at different codons.
A large number of conserved amino acids would mask the positive selection signals, and we found positive selection on variable amino acids of BMP15 and GDF9 that were exposed or buried residues according to the neural network algorithm.
As a refined selection test, M8 was compared with M7. M8 fit the data significantly better than M7. We found pos-

| Positive selection on amino acid positions
Identifying the evolutionary conservation of amino acid positions is important for understanding how protein structure and function are maintained. Therefore, detection of selected sites may shed light on the selection forces and reveal the functionally significant sites for bone morphogenetic protein interaction. To detect such sites, we used the Bayes method to estimate the posterior probabilities for each site. Sites with higher probabilities are expected to be positively selected with ω > 1. Using Bayes empirical Bayes (BEB) analysis for the 391 amino acids of BMP15, seventeen were found under positive selection, but no site could be identified at 99% or 95% posterior probability. GDF9 had 453 amino acid sites, and only seven amino acids showed positive selection (Table 2; Figure 1a,b).

The proportion of sites under positive selection (p1), or under selective constraint (p0), and parameters p and q for the beta distribution. Parameters indicating positive selection are in bold. Sites potentially under positive selection identified under model M8 are listed according to the human sequence numbering. Positively selected sites with posterior probability 0.9 are italicized, 0.8-0.9 in bold, and 0.5-0.7 in plain text. The test statistic 2Δl is compared to a χ² distribution with 2 df, critical values 5.99, 9.21, and 13.82 at 5%, 1%, and 0.1% significance, respectively. **Significant at 1% level; *Significant at 5% level.
To guard against false positive results from PAML, we also performed the positive selection test on the Selecton server (http://selecton.tau.ac.il/), which uses the mechanistic empirical combination (MEC) model for estimating the selection pressure at particular codons. The MEC model takes into account the differences between amino acid substitution rates.
Adaptive selection pressure was found at various codons in BMP15 (Figure 2) and GDF9 (Figure 3), which were identified as being under positive selection.

| Protein-protein interaction network
Searching the STRING database with the proteins encoded by BMP15 and GDF9 returned various PPI pairs. The PPI network had 21 nodes

| DISCUSSION
BMP15 and GDF9 contribute to the development of the primary follicle from the primordial follicle and play an essential role in the subsequent phases of follicular growth and maturation, enhancing the expression   (Meslin et al., 2012). We performed positive selection analyses on BMP15 and GDF9. We used coding nucleotide sequences of 32 mammalian species, which were evaluated by branch-site models in the PAML package (Yang, 2007), in order to investigate whether the diverse species in the phylogeny experienced selection pressure and to identify signs of episodic positive selection. These evaluations were performed on the complete sequences, the mature region, and the pro-region of BMP15 and GDF9. We studied all branches of the phylogenetic tree, and various codon sites were found under positive selection in the mammalian clade (Figure 1a,b and Table 1). We  (Ratnakumar et al., 2010). It has been reported that, in the Hominidae clade, the G-C content at the third codon position of BMP15 (53%) is higher than that of the gene in these taxa (46%) (Romiguier, Ranwez, Douzery, & Galtier, 2010). However, a current study (Gharib & Robinson-Rechavi, 2013)  and GDF9 under positive selection. We found positive selection signals at codon sites 131D, 147E, 163Y, 191W, and 236P of BMP15 (Figure 2) and 162F, 188K, 206R, 240A, 244L, 246H, 248S, 251D, 253L, 254F, and other codon sites of GDF9 (Figure 3). Positively selected amino acids such as alanine, leucine, arginine, and lysine are important for signaling. Among 24 mammalian species, positive selection signals were detected in human BMP15 (Auclair et al., 2013). Therefore, some amino acid sites under positive selection are essential for the specific role of the protein and consequently for female fertility (Persani, Rossetti, Di Pasquale, Cacciatore, & Fabre, 2014).
Moreover, the transformed form was more effective than the wild type in suppressing progesterone production in granulosa cells in ovine cell culture. There is evidence that BMP15 has evolved faster than other TGFβ family members and was subjected to positive selection in the mammalian clade (Persani et al., 2014). The sequence alignment reveals that BMP15 belongs to the TGFβ family of cytokines owing to the presence of the "cystine-knot" motif, with GDF9 as its closest homolog.
Like other TGFβ members, these molecules are first translated as a pre-pro-peptide, with a signal peptide at the N-terminus followed by the pro-domain and the C-terminal mature domain that delivers the biological action (Chang, Brown, & Matzuk, 2002). The specific functions of the pro-domains of TGFβ superfamily members are unknown. The proteolytically processed pro-region and mature region of BMPs remain attached noncovalently, usually interacting with the extracellular matrix (Sengle, Ono, Sasaki, & Sakai, 2011). The BMP15 pro-region drives the dimerization and subsequent secretion of the mature dimers and may help to alleviate the mature region bioactivity (Pulkki et al., 2011).
In our study, positive selection on BMP15 and GDF9 was detected at sites with ω > 1 (Table 1). This indicates that nonsynonymous (dN) sites evolved faster than synonymous (dS) sites and that positive Darwinian selection, outweighing purifying/balancing selection, favored new variants and raised allelic polymorphism (Bergström & Gyllensten, 1995), which in turn might alter protein structure and thus affect the signaling pathways (Cui et al., 2009). The differing amino acid substitutions across species might be the result of independent divergence from their common lineages, which agrees with earlier reports. Orthologs that diverged from their most recent common ancestor follow different evolutionary routes, which may drive differences in the selective constraints on homologous sites (Marini, Thomas, & Rine, 2010). Our analysis of bone morphogenetic protein genes involved in recent selection provides insight into some biological processes that have been targets of selection on recent and much longer evolutionary timescales (Voight, Kudaravalli, Wen, & Pritchard, 2006). Hence, understanding the history of selection in mammalian genomes promises to be an interesting research area for years to come.

| CONCLUSIONS
In the present study, we found that the BMP15 and GDF9 genes have evolved more rapidly than other TGFβ superfamily members and were subjected to selection pressure in the mammalian clade. Some positively selected amino acid sites are of significance for the specific role of the protein and consequently for female fertility. We presented comprehensive analyses to determine the genetic importance of BMP15 and GDF9. Selection analyses of bone morphogenetic proteins modulating reproduction could facilitate the development of strategies for genetic improvement and for selecting individuals with high breeding values for traits of interest as parents to produce the next generation.

ACKNOWLEDGMENTS
Author is thankful to anonymous reviewers for their valuable comments, suggestions, and critical reading of the manuscript.