Pineal gland transcriptomic profiling reveals the differential regulation of lncRNA and mRNA related to prolificacy in STH sheep with two FecB genotypes

Long noncoding RNA (lncRNA) has been identified as important regulator in hypothalamic-pituitary-ovarian axis associated with sheep prolificacy. However, little is known of their expression pattern and potential roles in the pineal gland of sheep. Herein, RNA-Seq was used to detect transcriptome expression pattern in pineal gland between follicular phase (FP) and luteal phase (LP) in FecBBB (MM) and FecB++ (ww) STH sheep, respectively, and differentially expressed (DE) lncRNAs and mRNAs associated with reproduction were identified. Overall, 135 DE lncRNAs and 1360 DE mRNAs in pineal gland between MM and ww sheep were screened. Wherein, 39 DE lncRNAs and 764 DE mRNAs were identified (FP vs LP) in MM sheep, 96 DE lncRNAs and 596 DE mRNAs were identified (FP vs LP) in ww sheep. Moreover, GO and KEGG enrichment analysis indicated that the targets of DE lncRNAs and DE mRNAs were annotated to multiple biological processes such as phototransduction, circadian rhythm, melanogenesis, GSH metabolism and steroid biosynthesis, which directly or indirectly participate in hormone activities to affect sheep reproductive performance. Additionally, co-expression of lncRNAs-mRNAs and the network construction were performed based on correlation analysis, DE lncRNAs can modulate target genes involved in related pathways to affect sheep fecundity. Specifically, XLOC_466330, XLOC_532771, XLOC_028449 targeting RRM2B and GSTK1, XLOC_391199 targeting STMN1, XLOC_503926 targeting RAG2, XLOC_187711 targeting DLG4 were included. All of these differential lncRNAs and mRNAs expression profiles in pineal gland provide a novel resource for elucidating regulatory mechanism underlying STH sheep prolificacy.


Background
Reproduction, one of the major factors significantly affecting profitability of sheep production, is a complicated physiological process and determined by the integrated hypothalamic-pituitary-ovarian axis in breeding season [1]. Reproductive traits like litter size directly determine benefit of sheep production, are controlled by poly-gene at the micro level. How to undertake at molecular level to improve reproduction, thereby serving macro production is a hotspot in recent years. BMPRIB, BMP15 [2] and GDF9 [3] are major fecundity genes which significantly influence sheep prolificacy. FecB is a mutation in BMPRIB occurring in base 746 from A to G, one copy of this mutation significantly increases ovulation rate in sheep about 1.5 and two copies by 3.0 [4]. To date, this mutation has been detected in diverse sheep species such as Booroola Merino sheep (Australia) [5], Small Tail Han (STH) and Hu sheep (China) [6]. Wherein STH sheep is a famous native breed with year-round estrus and high fecundity, being officially recognized as one of the polytocous breeds in China. The average litter size and lambing rate of STH sheep are 2.61, 286.5%, respectively [7]. There are three genotypes based on effects of FecB mutation in STH sheep, namely FecB BB (with two-copy FecB mutation), FecB B+ (with one-copy FecB mutation) and FecB ++ (with no FecB mutation), which is closely correlated with litter size of ewes [8]. Therefore, this breed can be used as a classic model for study molecular mechanism of FecB gene regulation of reproductive traits in sheep.
Long noncoding RNA (lncRNA) is polymerase II transcript with length longer than 200 nucleotides that lacks the protein coding ability, its expression has high tissue specificity and distributes in cytoplasm or nucleus [9]. LncRNA is proposed to be the largest transcript class in mammalian transcriptome [10], less than 2% of mammalian genome actually code for protein, 70-90% is transcribed in some context as lncRNA, originally thought to be 'transcriptional noise' in genome. Subsequently, studies have gradually shown that lncRNA exerts important roles in various biological processes such as cell proliferation, apoptosis and differentiation [11], signal transduction [12], immune regulation [13]. In terms of reproduction, there have many reports on lncRNA. For example, Miao et al. (2017) compared transcripts in ovaries of low fecundity ewes and high fecundity ewes, found that differentially expressed (DE) lncRNA significantly enriched in the oxytocin signaling pathway [14]. Then, Feng et al. (2018) identified 5 lncRNAs and 76 mRNAs in ovaries of Hu sheep with high and low prolificacy, respectively [15]. Yang et al. (2020) analyzed lncRNA and mRNA in male sheep pituitary and found that 5 candidate lncRNAs and their targeted genes enriched in growth and reproduction related pathways [16]. Su et al. (2020) screened differential lncRNA through high-throughput sequencing, concluded that XLOC-2222497 and its target AKR1C1 could interact with progesterone in porcine endometrium for controlling pregnancy maintenance [17]. These studies indicated the presence and role of lncRNA in reproductive tissues. It is known that the sheep pineal gland as an important reproductiverelated gland, that is closely related to hormone and signal transduction. However, studies on function of sheep lncRNA in this organ are limited.
In light of this, the study presented herein was focused on analyzing transcriptomics of pineal gland in STH sheep with FecB BB (MM) and FecB ++ (ww) genotypes, to determine the DE lncRNAs and genes, and predict their potential function that related to reproduction. Which is essential for better understanding the molecular mechanisms by lncRNAs regulate sheep reproduction with different genotypes, also providing insight for other female mammals.

Summary of raw sequence reads
After removing low-quality sequences, a total of 288, 342,450, 250,073,062, 289,224,844 and 277,834,922 clean reads with greater than 91.91% of Q30 were obtained in MM_F, MM_L, ww_F and ww_L, respectively. Approximately 86.10 to 92.89% of the reads were successfully mapped to the Ovis aries reference genome (Table 1).

GO analysis of the biological function of DE lncRNAs and mRNAs
GO annotation enrichment was used to describe functions of the DE lncRNAs and mRNAs involved in cellular components, molecular function and biological processes, as shown in Fig. 3. Between MM_ FP and MM_LP, targeted genes for DE lncRNAs were most enriched, and the terms were related to regulation of trans-membrane transport, antigen processing and presentation, immune system process. DE mRNAs were most enriched, the meaningful terms were related to the regulation of C-terminal protein methylation, C-terminal protein amino acid modification, post-translation protein modification, cellular macromolecular complex assembly and cellular Between MM_FP and ww_FP, targeted genes for DE lncRNAs were enriched, the terms were related to regulation of protein complex assembly and biogenesis, protein complex subunit organization, spindle assembly involved in mitosis process. DE mRNAs were most enriched, the meaningful terms were related to regulation of secondary metabolic and biosynthetic process, viral protein processing, single-organism biosynthetic process (Fig. 3b, Supplementary material 5B, 6B). C-terminal protein methylation C-terminal protein amino acid modification post-translational protein modification cellular metabolic process cellular macromolecular complex assembly cellular macromolecule metabolic process protein polymerization organic hydroxy compound metabolic process macromolecular complex subunit organization macromolecular complex assembly cellular protein complex assembly peroxisome fission barbed-end actin filament capping nucleic acid metabolic process nucleobase-containing compound metabolism mRNAs -Log10(Pvalue) 0 1 2 3 protein complex assembly protein complex biogenesis spindle assembly involved in mitosis protein complex subunit organization macromolecular complex assembly cellular component assembly macromolecular complex subunit organization viral DNA genome packaging endosome transport via multivesicular body sorting response to pheromone cellular component biogenesis protein deubiquitination protein modification by small protein removal cytoskeletal anchoring at plasma membrane mitotic spindle organization lncRNA targets secondary metabolic process secondary metabolite biosynthetic process viral protein processing single-organism biosynthetic process mycotoxin metabolic process mycotoxin biosynthetic process aflatoxin biosynthetic process aflatoxin metabolic process organic heteropentacyclic compound metabolism organic heteropentacyclic compound biosynthesis polyketide metabolic process polyketide biosynthetic process regulation of transcription, DNA-dependent regulation of RNA biosynthetic process regulation of RNA metabolic process mRNAs -Log10(Pvalue) RNA methylation metabolic process organic substance metabolic process ncRNA processing rRNA modification rRNA methylation RNA processing macromolecule methylation rRNA processing rRNA metabolic process gene expression cellular metabolic process macromolecule biosynthetic process homeostatic process primary metabolic process mRNAs -Log10(Pvalue) 0 1 2 3 4 immune response immune system process antigen processing and presentation glycerophospholipid metabolic process glycerolipid metabolic process phosphatidylinositol metabolic process transmembrane transport microtubule-dependent transportation microtubule-dependent intracellular transport of viral material to nucleus intracellular transport of viral material cellulose metabolic process cell cycle checkpoint response to ionizing radiation cellular response to abiotic stimulus lncRNA targets -Log10(Pvalue) D ww_F_P vs ww_L_P 0 1 2 3 nucleosome assembly chromatin assembly nucleosome organization chromatin assembly or disassembly regulation of cell shape regulation of biological quality response to extracellular stimulus cellular response to extracellular stimulus cellular response to external stimulus regulation of cell morphogenesis protein-DNA complex assembly protein-DNA complex subunit DNA packaging photosynthesis macromolecular complex assembly mRNAs -Log10(Pvalue) Between MM_LP and ww_LP, targeted genes for DE lncRNAs were enriched, the terms were mainly related to regulation of single organism signaling, signal transduction, cellular response to stimulus and cellular communication. DE mRNAs were enriched, the meaningful terms were related to regulation of RNA methylation, metabolic process, organic substance metabolic process (Fig. 3c, Supplementary material 5C, 6C).
Between ww_FP and ww_LP, targeted genes for DE lncRNAs were enriched, the terms were related to regulation of immune response, glycerolipid metabolic process, cellular response to abiotic stimulus. DE mRNAs were enriched, the terms were related to regulation of nucleosome and chromatin assembly, nucleosome organization process (Fig. 3d, Supplementary material 5D, 6D). Between MM_FP and ww_FP, DE lncRNA targeted mRNAs were associated with pathways such as phosphatidylinositol signaling system, TNF signaling and p53 signaling pathway (Fig. 5a, Supplementary Between MM_LP and ww_LP, DE lncRNA targeted mRNAs were associated with pathways such as olfactory transduction, gap junction and thyroid hormone signaling pathway (Fig. 6a, Supplementary material 7C). With regard to DE mRNAs, which were enriched in ubiquitin mediated proteolysis, vasopressin-regulated water reabsorption, non-homologous end-joining and cell cycle (Fig. 6b, Supplementary material 8C).
Between ww_FP and ww_LP, DE lncRNA targeted mRNAs were associated with pathways such as cell adhesion molecules (CAMs), GSH metabolism and tight junction pathway (Fig. 7a, Supplementary   material 7D). DE mRNAs were enriched in spliceosome, notch signal pathway, RNA polymerase and adherens junction, ras signaling pathway (Fig. 7b,  Supplementary material 8D).
Hence, we acquired DE mRNAs closely related to reproductive signal pathways on the whole from above four comparison groups (Table S1).

Discussion
Studies have found that lncRNA is involved in multiple reproductive functions such as spermatogenesis [18], placentation [19], signaling pathway of sex hormone response [20,21] and gonadgenesis [22]. It is known that the melatonin synthesized in pineal gland is closely related to the estrus cycle [23]. Herein, the study focused on examining expression profiles of pineal gland lncRNAs and mRNAs in sheep with two genotypes at different phases of estrous cycle using RNA-Seq technology. Analysis of relationship between DE lncRNAs and mRNAs by generating a co-expression network. To our knowledge, this is the first genome-wide analysis of pineal gland in sheep with different genotypes, and might provide valuable resource for searching functional lncRNAs associated with sheep prolificacy.
In present study, we screened 21,282 lncRNAs and 43, 674 mRNAs. LncRNAs have synergetic relationship with mRNAs as most lncRNAs are located near proteincoding genes [24,25]. Additionally, wide location of lncRNAs in chromosomes of sheep indicated its pluripotency. Obviously, distribution ratio of lncRNAs and mRNAs on Chr2 (NC_019459.2), Chr3 (NC_019460.2), Chr1 (NC_019458.2) were greater than those on other chromosomes, which could be explained by close relationship between three chromosomes and pineal gland function. The exon size and ORF length of lncRNAs and mRNAs are mostly within 1000 bp. These results showed the potential lncRNAs were reliable in the pineal gland.
Overall, we screened 135 (39 + 96) DE lncRNAs and 1360 (764 + 596) DE mRNAs in pineal gland at follicular and luteal phases between high and low prolificacy STH sheep (WW vs ww). GO annotation and KEGG enrichment analysis of top 20 terms indicated that DE mRNAs were enriched in reproduction-related pathways such as GnRH, cGMP-PKG, thyroid hormone, MAPK, phototransduction, circadian rhythm, steroid biosynthesis, hippo, mTOR and melanogenesis. It is well known that productive cycle of mammals is regulated through association or acting alone of hypothalamic-pituitary-thyroid (HPT) axis and hypothalamic-pituitary-gonadal (HPG) axis [26,27]. In the HPT axis, thyrotropin-releasing hormone (TRH) produced in hypothalamus stimulates pituitary to secrete thyroid-stimulating hormone (TSH), which promotes TH synthesis in the thyroid gland [26,28]. In the HPG axis, GnRH in hypothalamus regulates synthesis and secretion of FSH and LH in the anterior pituitary. These two hormones stimulate gonadal estrogen synthesis by binding to their receptors for affecting development and maturation of follicles and the ewes litter size. Estrogen in turn positively or negatively acts GnRH synthesis, and affects FSHβ gene expression, a hormone specific β subunit that is mainly regulated by GnRH [29,30]. In the process, binding of GnRH to its receptor activates signaling cascades like MAPK, PI3K-Akt [31]. MAPK pathway is essential for cell proliferation and differentiation, survival, death and transformation [32,33]. PI3K-Akt can interact with mTOR pathway to effectively regulate growth hormone in pituitary [34]. Additionally, pathways as hippo modulates organ size growth by controlling stem cell activity, proliferation and apoptosis. For instance, its' effect on the development of pituitary progenitor cells [35]. Our results showed that DE genes like AKT3, MYC, PIK3CB, MAP2K2, PLCB1 and TEAD1 related to thyroid hormone, MAPK, cGMP-PKG, hippo, and up regulated, while CTNNB1, YAP1, PIK3CG, TEAD1, CAMK2A, CACNA1D mainly related to hippo, thyroid hormone, cGMP-PKG, AMPK, GnRH, oxytocin, circadian entrainment, and down regulated, which implied the important roles of these genes mainly involved in regulation of hormone-related pathways. It's worth exploring their function in pineal gland as candidate genes.
Co-expression analysis of differential lncRNA-mRNA and functional prediction of target genes revealed that lncRNA affects sheep fecundity by modulating genes associated with above signaling pathways and biological processes. In FecB BB genotype sheep, XLOC_466330 and the targets (RRM2B, GSTK1) up regulated at follicular phase, which related to GSH metabolism. Whereas XLOC_391199 and the target (STMN1), XLOC_503926, XLOC_517836 and the target (RAG2) up regulated at luteal phase, which mainly enriched in MAPK, FoxO signaling pathways, respectively. In FecB ++ genotype sheep, XLOC_347557 and the target (GPX2), XLOC_ 532771 and the targets (LOC101111397, RRM2B), XLOC_339502 and the target (GPX1), XLOC_028449 and the target (GSTK1) up regulated at follicular phase, which also related to GSH metabolism. Meanwhile, 105, 604,037 and the target (MGST1), XLOC_187711 and the target (DLG4) down regulated at the same phase that related to GSH metabolism and hippo signaling. Wherein GSTK1 and RRM2B involved in GSH metabolism at follicular phase, but their targeted regulators lncRNAs were markedly different among two FecB genotypes. RRM2B gene encodes p53R2, and p53R2 is expressed at all phases of cell cycle to ensure ample supply of mitochondrial DNA [36]. GSTK1 gene encodes a member of GSTK superfamily of enzymes that function in cellular mitochondria and peroxisomes detoxification during GSH metabolism [37,38], a critical pathway protecting cells from free radicals and oxidative damage, could increase intracellular NADPH [39]. With increase of NADPH oxidase, ROS level tend to be low, whereas the level of intracellular ATP enhanced, as well as mitochondrial activity, which promote oocyte maturation [40], and so forth, the other DE genes involved in GSH metabolism were also novel direction of interest for their effects on the downstream reproductive system. Furthermore, DE target genes like STMN1 is a highly conserved gene that codes for cytoplasmic phosphoproteins, acting role in cell cycle progression, signal transduction and cell migration through diverse intracellular signaling pathways. Studies have found the potential role of STMN1 in regulation of hormone secretion in rodent pituitary and insulinoma cell lines [41]. Over-expression of STMN1 stimulates progesterone production by modulating the promoter activity of Star and Cyp11a1 in mouse granulosa cells [42]. Besides, RAG2 is indispensable for generation of antigen receptor diversity in immune cells [43]. We found STMN1, RAG2 were down regulated at follicular phase in FecB BB sheep, and mainly related to MAPK, FoxO signaling pathways, respectively.  Note: "i-" represents the identical information with previous one in the same column DLG4 was down regulated at follicular phase in FecB ++ sheep and enriched in hippo signaling term. As known that DLG4 encodes a member of MAGUK family, is widely expressed and playing an essential role in regulation of cellular signal transduction, circadian entrainment [44]. Taken together, the DE lncRNAs identified in this study might cooperate with their target genes and DE genes to regulate pineal gland physiological function, and involved in hormone synthesis for effecting reproductive cycle and final lambing.

Conclusion
In summary, the pineal gland transcriptomic study reveals differential regulation of lncRNAs and mRNAs  i-Note: "i-" represents the identical information with previous one in the same column related to prolificacy in sheep with different FecB genotyping. We screened several sets of target genes of DE lncRNAs and DE genes under reproductive cycle and genotypes, they were annotated to multiple biological processes such as phototransduction, circadian rhythm, melanogenesis, GSH metabolism and steroid biosynthesis, which directly or indirectly participate in hormone activities to affect sheep reproductive performance. Additionally, we predicted potential role of these DE lncRNAs and constructed network of lncRNAs-mRNAs to expand our understanding. All of these differential lncRNAs and mRNAs expression profiles provide a novel resource for elucidating regulatory mechanism underlying STH sheep prolificacy.

Ethics statement
Experimental

Animals preparation
Animals were from a core population of STH sheep in Luxi district of Shandong province, China. We collected jugular vein blood of healthy non-pregnant sheep aged 2-4 years (n = 890), to identify the FecB genotypes using TaqMan probe [45]. Then, 12 sheep (6 MM and 6 ww, respectively) with no significant difference in age, weight, height, body length, chest circumference and tube circumference were selected for this experiment. Twelve sheep were managed and raised on a farm, with free access to water and feed. All sheep were processed by estrus synchronization with Controlled Internal Drug Releasing device (CIDR, progesterone 300 mg, Inter Ag Co., Ltd., New Zealand) for 12 days. 3 MM and 3

Construction of RNA libraries and sequencing
A total amount of 3 μg RNA per sample was used as input material for the RNA sample preparation. Firstly, rRNA was removed by Epicentre Ribo-zero™ rRNA Removal Kit (Epicentre, USA) and rRNA free residue was cleaned up by ethanol precipitation. Subsequently, libraries were generated using the rRNA-depleted RNA by NEBNext® Ultra™ Directional RNA Library Prep Kit for Illumina® (NEB, USA) following manufacturer's recommendation. After the clustering of the index-coded samples was performed on a cBot Cluster Generation System using TruSeq PE Cluster Kit v3-cBot-HS (Illumia), libraries were then sequenced through an Illumina Hiseq 4000 platform and 150 bp paired-end reads were generated.

Reference genome mapping and transcriptome assembly
Raw reads of fast-q format were firstly processed through in-house perl scripts to obtain clean reads. At the same time, Q20, Q30 and GC content of the clean data were calculated. All downstream analyses were based on the high quality clean reads. HISAT2 v. 2.0.4 was used to align paired-end clean reads of each sample to sheep reference genome Oar v. 4.0 [46]. The mapped reads of each sample were assembled by StringTie v. 1.3.1 [46].

Identification of potential lncRNA candidates
Potential lncRNA candidates were identified using the following workflow. (1) Transcripts with uncertain chain direction were removed by Cuffmerge. (2) Transcripts length > 200 nt with exon number ≥ 2 were selected. (3) Cuffcompare v. 2.1.1. was used to compare different classes of class_code annotated by "i", "u" and "x" that were retained, which corresponded to intronic, intergenic, and anti-sense transcripts, respectively. (4) [49] were used to predict the protein-coding potential. After that, Pfam was implemented to exclude transcripts that overlapped with any protein-coding genes. Intersection of these transcripts without coding potential was used as the lncRNA data set. Additionally, mRNAs were obtained from the same RNA-seq libraries in this study.

Analysis of differentially expressed genes
The fragments per kilobase of transcript per million reads mapped (FPKM) value was used to estimate the expression levels of transcripts (lncRNAs and mRNAs) [50]. For experiments with three biological replicates, the R package DEseq2 was used to identify differentially expressed transcripts after a negative binomial distribution [51]. LncRNAs and mRNAs with P-value < 0.05 and a fold change (FC) > 2.0 were considered as differentially expression between two groups of data.

Bioinformatics analysis
LncRNA targets could be predicted by the correlation or co-expression of lncRNA and mRNA expression levels between samples. Among them, Pearson correlation coefficient (PCC) was used to analyze the correlation between lncRNA and mRNA among samples, mRNAs with |PCC| ≥0.95 for functional enrichment to predict lncRNAs [52]. Statistical enrichment of DE lncRNA targets and DE mRNAs in GO annotation and KEGG pathway were evaluated, P-value ≤0.05 defined as the significant threshold, significance of the term enrichment analysis was corrected by FDR and corrected P-value (Q-value) was obtained [53].

Construction of co-expression networks
To predict function of DE lncRNAs and their target genes in sheep reproduction, a network based on lncRNAs and mRNAs was bulit using Cytoscape software (v. 3.8.0) [54].

Statistical analysis
All data were assessed as the "means ± SD". Student's ttest was performed and P < 0.05 was considered statistically significant.