A dataset of hidden small non-coding RNA in the testis of heat-stressed models revealed by Pandora-seq

Infertility, a worldwide reproductive health concern, impacts approximately one in five couples. Male infertility, stemming from spermatogenic dysfunction and reduced sperm quality, stands as a primary factor contributing to infertility. Given the global decrease in male fertility linked to environmental factors like the greenhouse effect, it is crucial to develop a comprehensive understanding of how increased temperatures impact both the quantity and quality of sperm. In this study, we utilized Pandora-seq technology to detect the small non-coding RNAs (sncRNAs) expression profile in the testicular tissue of heat-stressed mice. The investigation explores the dynamic shifts in sncRNAs within the mouse testis under heat stress, including miRNAs, tsRNAs, piRNAs, rsRNAs and other sncRNAs. Furthermore, we successfully identified differentially expressed sncRNAs in testicular tissues before and after heat stress. Subsequently, we conducted functional enrichment analysis on the potential predicted target genes of differentially expressed miRNAs and tsRNAs. These datasets will constitute a valuable foundational resource for further investigations into the decline in male reproductive capacity triggered by heat stress.


Background & Summary
Infertility, namely reproductive failure, has become the main problem of human reproductive health.It is estimated that infertility affects up to one-fifth of couples worldwide 1 .Among them, infertility caused by male factors accounts for 30-50% of infertility cases 2,3 .Given the established correlation between semen quality and successful pregnancy, the evaluation of male reproductive capacity often relies on semen quality assessments [4][5][6] .Notably, global male fertility is declining rapidly, with average sperm count and concentration dropping by approximately 50% 7 .Studies have indicated that the greenhouse effect is considered to be the current culprit for the reduction of semen quality in men 8,9 .Scrotal temperature profoundly impacts male reproductive health and semen quality.Usually, maintaining testicular temperature 2-4 °C lower than the core body temperature is crucial for optimal spermatogenesis.However, when the testis is exposed to heat stress, it accelerates germ cell apoptosis 10 .The evidence has revealed that for every 1 °C rise in temperature, there's a correlated decrease in sperm production by 14%, significantly impacting overall sperm count 11 .Hence, it is imperative to comprehend the underlying causes behind the decline in both quality and quantity of male sperm due to high temperatures.This understanding holds significant importance in averting the decline in male fertility.
In recent years, there has been a noticeable increase in research attention directed towards understanding the involvement of non-coding RNAs across diverse biological processes 12 .Non-coding RNAs include long non-coding RNAs (lncRNAs) and small non-coding RNAs (sncRNAs).Over the past decade, high-throughput RNA sequencing technology has significantly advanced the exploration of functional sncRNAs, notably miRNAs.However, when sncRNAs carry specific modifications, including 3′end modification (blocking the adapter ligation process) and RNA methylation modification (interfering with the reverse transcription process), the conversion efficiency of sncRNAs to cDNA is greatly reduced, resulting in reduced sequencing depth and accuracy 13,14 .To address these challenges, a novel RNA-Seq approach named Pandora-Seq (Panoramic RNA Display by Overcoming RNA modification aborted sequencing) has been developed, effectively circumventing the interference of RNA modifications with sequencing results 14 .The sequencing approach relies on enzymatic treatment of small RNAs within the 15-50 nucleotide range, employing dealkylase α-ketoglutarate-dependent hydroxylase (AlkB) and its mutant form 15 , as well as T4 polynucleotide kinase treatment 16,17 .This process effectively eliminates sequencing-affecting modifications, resulting in a comprehensive profile of sncRNAs (Fig. 1a).
Epigenetic changes, including alterations in sncRNAs, play a pivotal role in gametogenesis and are closely linked to various reproductive disorders 18 .Recognized as pivotal contributors to epigenetic regulation, sncRNAs have garnered considerable attention for their crucial involvement in testicular development and spermatogenesis [19][20][21] .In the context of global warming induced by the greenhouse effect, heat stress factors are impacting male semen quality and fertility 8 .Hot water bath is a common form of testicular heat stress.Typically, exposure to hot water for over 30 minutes per day can lead to specific damage in the anatomical morphology and sperm quality of the male testis within a week 22 .Previous studies predominantly delved into specific types of sncRNAs, such as microRNA 23 or piRNA 24 , to understand testicular damage induced by heat stress.However, the dynamic alterations within the intricate regulatory network of sncRNAs in the testis under heat stress remain unexplored.Therefore, to reveal the dynamic changes of the sncRNA library in the mouse testis under heat stress, we collected testicular tissues from both the normal group (NC) and hot water group (HW) mice, obtaining their Pandora sequencing data (Fig. 1b).We analyzed a total of 8 samples, with 4 biological replicates in NC and HW group.Details of the specific samples are presented in Table 1.We used the Illumina NextSeq sequencing platform to generate a total of 278676235 raw reads.The sequencing results were annotated using the small RNA annotation software SPORTS1.1, yielding a total of 99,192,693 match genome reads (Table 2).Our data comprehensively delineate the dynamic changes in small non-coding RNAs (sncR-NAs) in testicular tissue before and after heat stress in mice, including miRNAs, tsRNAs, piRNAs, rsRNAs and other sncRNAs.

HW mice preparation and sample collection.
In this study, we selected 8 ten-week-old male ICR mice (36.50 ± 1.79 g) from DOSSY Experimental Animals Co., Ltd.(Chengdu, China).To establish the testicular heat stress model, mice underwent a daily 25-minute immersion in 43 °C hot water for one week, while the control group experienced baths in 33 °C water.One week following the hot water bath treatment, eight ICR mice were euthanized, and their testicular and epididymal tissues were promptly collected and frozen in liquid nitrogen.The ethical considerations of this study were approved by the Ethics Committee of Sichuan Agricultural University.Total RNA isolation and quality control.Total RNA was extracted with TRIzol reagent according to manufacturer's instructions (Invitrogen; 15596018).Agarose gel electrophoresis was utilized to evaluate the quality of the total RNA sample, followed by the isolation of RNA of the specified size from the total RNA.Concurrently, RNA concentration was determined using a Nanodrop spectrophotometer (NanoDrop 2000, Thermo Fisher Science, USA).
Small RNA library construction and deep sequencin.The RNA segment was separated by PAGE, then a 15-to 45-nucleotide stripe was selected and recycled.The adapters were obtained from the QIAseq ® miRNA Library Kit (QIAGEN: 331505) and ligated sequentially.The amplified flow cell was sequenced on the Illumina system by epibiotek (Guangzhou Epibiotek Co., Ltd., Guangzhou, China).The raw sequencing data were generated using the SE75 mode of the NextSeq CN500 sequencing platform.
Small RNA annotation and analyses for PANDORA-seq data.The annotation of small RNA sequences was carried out using the non-coding RNA annotation and analysis software SPORTS1.1, specifically optimized for small RNAs derived from rRNA and tRNA.Reads were mapped to the following individual non-coding RNA databases sequentially: (1)   Mapping to the parent rRNA in ascending order of rRNA sequence length will substantially ensure the uniqueness of each rsRNA annotation.For tsRNA annotation, annotation was based on pre-tRNA and mature tRNA.DESeq 2 was employed to identify sncRNAs with significant differences, using screening criteria of log 2 (Fold Change) > 1 and P < 0.05.
Dynamic changes of sncRNAs expression and sequence characteristics in mouse testicular tissue before and after heat stress.Initially, we performed principal component analysis (PCA) on the read counts of miRNA, tsRNA, rsRNA, and piRNA for each sample (Fig. 2a).The PCA results highlighted significant differences in the distribution of testicular tissue samples before and after heat stress across each sncRNA family.By analyzing the read counts of diverse sncRNAs, we characterized the expression profile of sncRNAs in mouse testicular tissue before and after exposure to heat stress.The pie chart illustrates the proportion distribution of different types of sncRNAs in testicular tissue before and after heat stress treatment (Fig. 2b).The results revealed that upon the onset of testicular heat stress, the proportion of rsRNA expression increased from 42.03% to 49.10%, whereas the proportion of piRNA expression dropped from 1.75% to 0.13%.We further investigated the nucleic acid length distribution for each subtype of sncRNAs (Fig. 2c).Specifically, miRNA lengths were concentrated around 22 nt, tsRNA exhibited distinct peaks at 18 nt and 32 nt, rsRNA displayed a peak at 31 nt, and ysRNA presented a peak with relatively less clarity.Interestingly, the peak value of piRNA was evident at 29 nt or 30 nt before the induction of heat stress.However, upon the occurrence of heat stress, there was a noticeable reduction in the overall expression of piRNA, and the peak value became less defined.The sncRNA landscape of rsRNA and tsRNA before and after heat stress in mice.Taking into consideration the diverse types of rsRNA and tsRNA, we conducted a detailed analysis of the subtypes within these two categories of sncRNAs.Figure 3a illustrated the distribution of the number of rsRNAs derived from fragments of different lengths.The absolute proportion of rsRNA originating from 28 S rRNA was approximately 50% in both sample groups.Furthermore, following heat stress in mouse testis, there was a notable decrease in the proportion of rsRNA derived from 5.8 S rRNA.It is worth noting that, as previously illustrated in Fig. 2b, the proportion of rsRNA expression increased under heat stress conditions.Similarly, we analyzed the distribution of various types of tRNA derived fragments in the samples (Fig. 3b).The results indicated that heat stress treatment elevated the proportion of 5′ tsRNA in mouse testicular tissue while decreasing the proportion of 3′ CCA tsRNA.Additionally, we visualized the sequence length distribution for each type of rsRNA and tsRNA (Fig. 3c,d).We also conducted an analysis of the sources of different types of tsRNA, which include pre-tRNA, mature tRNA, and mitochondrial tRNA.Due to the exceptionally low content of pre-tRNA, it is not represented in the pie chart (Fig. 3e).Interestingly, heat stress appears to have no influence on the content of 3′ tsRNA from diverse sources in the mouse testis.However, heat stress led to a reduction in the expression of 5′ tsRNA and 3′ CCA tsRNA  derived from mature tRNA, while simultaneously increasing the expression of 5′ tsRNA and 3′ CCA tsRNA derived from mitochondrial tRNA.
Identification of differentially expressed sncRNAs.Differentially expressed sncRNA analysis was performed using the R package DESeq 2. The criteria used for screening differentially expressed sncRNAs were log 2 (Fold Change) > 1 and P value < 0.05.We visualized the differentially expressed sncRNAs based on the screening conditions.The red circle represents up-regulated sncRNAs in the HW group, while the blue circle represents down-regulated sncRNAs (Fig. 4a).Simultaneously, utilizing P values as a criterion, we identified and labeled the top five up-regulated and down-regulated sncRNAs with the most significant differences.Given that miRNA and tsRNA exert their functions relying on the 5′-end seed sequence, we conducted an analysis of the base preference of the seed sequence in miRNA and tsRNA (Fig. 4b).We observed a significant difference in the base site preference of seed sequences for differentially expressed miRNAs and tsRNAs between the two sample groups before and after heat stress.For differentially expressed miRNAs (Threshold: log 2 FC > 1.5 and P value < 0.05), we predicted their potential target genes using three databases: miRecords, miRTarBase, and TarBase (https://jingege.shinyapps.io/jingle_molecular/).A total of 3408 validated target genes were predicted, and five of these target genes were consistently displayed in all three databases (Fig. 4c).For potential target genes of miRNAs, we conducted GO and KEGG functional enrichment analysis using the DAVID website (https://david.ncifcrf.gov/summary.jsp).GO functional enrichment analysis revealed that the potential target genes of miRNAs were predominantly enriched in biological processes such as transcriptional regulation, cell survival and death, and protein function and transport (Fig. 4d).KEGG analysis demonstrated that the potential target genes of differentially expressed miRNAs, both before and after heat stress, were predominantly enriched in pathways related to signal transduction, cell survival, and metabolism (Fig. 4e).Similarly, for differentially expressed tsRNAs, we predicted their potential target genes using the online prediction website TargetScan (https://www.targetscan.org/vert_50/seedmatch.html) and conducted GO and KEGG functional enrichment analysis using the DAVID website (https://david.ncifcrf.gov/summary.jsp).GO functional enrichment analysis revealed that the potential target genes of tsRNA were predominantly enriched in biological processes such as transcriptional regulation, cell development and differentiation, and cell survival and death (Fig. 4f).KEGG analysis showed that the potential target genes of differentially expressed tsrna before and after heat stress were mainly enriched in pathways related to signal transduction, cytoskeleton and signal regulation, and cell metabolism (Fig. 4g).

Data Records
The raw sequence data reported in this paper have been deposited in the Genome Sequence Archive (Genomics, Proteomics & Bioinformatics 2021) in National Genomics Data Center (Nucleic Acids Res 2022), China National Center for Bioinformation/Beijing Institute of Genomics, Chinese Academy of Sciences (GSA: CRA014959) that are publicly accessible at https://ngdc.cncb.ac.cn/gsa 25 .

technical Validation
RNA quality control.The purity and size of the extracted RNA were assessed using agarose gel electrophoresis to ensure the integrity of the samples utilized for constructing both the small RNA library and Pandora sequencing.The detailed information for each RNA sample was presented in Table 3, and the original agarose gel electrophoresis results were provided in Fig. 5a.The results indicated the good quality of our RNA samples, and the electrophoresis conditions were deemed suitable, ensuring the reliability of subsequent Pandora sequencing samples.
Sequencing quality control.The raw sequencing data in FastQC format were obtained from the Illumina sequencing platform.Quality assessment of the sequencing data was performed using FastQC software (version 0.11.8).We created a per-base sequence quality distribution map for each sample.The y-axis represents the sequencing quality score (expressed as Q), with Q = 20% indicating an error read rate of 1%, and Q = 30% indicating an error read rate of 0.1%.The results indicated that Q > 30%, signifying good quality of the sequencing data (Fig. 5b).

Reproducibility Validation
To verify the biological repeatability of mouse testicular tissue samples before and after heat stress, we performed Pearson correlation coefficient analysis on all 8 samples.The correlation heatmap indicated a high correlation coefficient among biological replicates for both before and after heat stress samples (Fig. 6).Furthermore, principal component analysis (PCA) demonstrated distinct separation between samples before and after heat stress, with biological samples from the same treatment forming clearly clustered groups (Fig. 2a).These results affirm the high confidence and reliability of our research data.

Fig. 1
Fig. 1 Overview of the experimental process.(a) Illustrations of comparison between Pandora-seq and RNAseq.(b) Illustrations of mouse model construction, sample collection, and Pandora sequencing experimental procedures.
genomic tRNA database GtRNAdb; (4) the mitochondrial tRNA database mitotRNAdb;(5) the piRNA database, including piRBase and piRNABank; (6) the non-coding RNAs defined by Ensembl and Rfam 12.3.In the annotation of tsRNAs, both pre-tRNA and mature tRNA sequences were employed for comprehensive annotation.Mature tRNA sequences were derived from the GtRNAdb and mitotRNAdb sequences using the following procedures: (1) predicted introns were removed; (2) a CCA sequence was added to the 3′ ends of all tRNAs; and (3) a G nucleotide was added to the 5′ end of histidine tRNAs.The tsRNAs were categorized into four types based on the origin of the tRNA loci: (1) 5′ tsRNA (derived from the 5′ end of pre-/mature tRNA); (2) 3′ tsRNA (derived from the 3′ end of pre-tRNA); (3) 3′ tsRNA-CCA end (derived from the 3′ end of mature tRNA); (4) and internal tsRNAs (not derived from 3′ or 5′ loci of tRNA).For rsRNA annotation, rsRNAs derived from 4.5 S, 5 S, 5.8 S, 12 S, 16 S, 18 S, 28 S, and 45 S rRNA were distinguished based on the sequence fragment size of rRNA.

Fig. 2
Fig. 2 The expression and sequence characteristics of sncRNAs in testicular tissue of mice before and after heat stress.(a) Principal component analysis (PCA) based on the reading counts of various sncRNAs in the sample.(b) The distribution of sncRNAs types in samples before and after heat stress was expressed in proportion.(c) Length distribution of various types of sncRNAs in samples before and after heat stress.

Fig. 3
Fig. 3 The sncRNA landscape of rsRNA and tsRNA before and after heat stress in mice.(a) Expression patterns of rsRNA fragments in samples before and after heat stress.(b) Expression patterns of tsRNA types in samples before and after heat stress.(c) Distribution of sequence lengths among rsRNA fragment subtypes.(d) Distribution of sequence lengths for tsRNA types.(e) Distribution of tsRNA type sources.

Fig. 4
Fig. 4 Conserved tsRNAs in adipose tissue of pigs and mice.(a) Volcano plot of differentially expressed sncRNAs between NC and HW groups.(b) Bias of differentially expressed miRNA and tsRNA seed sequences.(c) Venn diagram of potential target genes predicted by differentially expressed miRNAs in three databases.(d) GO enrichment analysis of potential target genes of miRNAs.Each type showed the enrichment results of the top 15 of gene count.(e) KEGG analysis of potential target genes of miRNAs.Only the top 15 signaling pathways with the most significant differences were showed.(f) GO enrichment analysis of potential target genes of tsRNAs.Each type showed the enrichment results of the top 15 of gene count.(g) KEGG analysis of potential target genes of tsRNAs.Only the top 15 signaling pathways with the most significant differences were showed.

Table 1 .
the microRNA database miRBase 21; (2) the rRNA and YRNA databases assembled from the National Center for Biotechnology Information nucleotide and gene database; (3) the Basic sample collection information of mice.

Table 2 .
Summary of PANDORA-Seq Reads.Raw reads: Original sequencing readings without quality filtering and clean.Clean reads: SPORTS 1.1 outputs clean reads by removing the sequence adapter, discarding reads obtained by sequences that are longer than the defined range and non-ATUCG sequences.Match Genome Reads: Reads number of raw reads aligned with the reference genome readings.miRNAs match reads: Number of reads aligned to miRNAs sequences.rsRNAs match reads: Number of reads aligned to rRNAs sequences.tsRNAs match reads: Number of reads aligned to mature and precursor tRNAs.mt-tsRNAs match reads: Number of reads aligned to mature and precursor tRNAs in mitochondria.piRNAs match reads: Number of reads aligned to PIWI-interacting RNAs sequences.

Table 3 .
Sample RNA quality test results.