Comparative Transcriptome Analysis of the Ovary and Testis in Noble Scallop (Chlamys nobilis)

1Key Laboratory of Aquatic Product Processing, Ministry of Agriculture and Rural Affairs, South China Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Guangzhou, PR China 2Guangxi Key Laboratory of Beibu Gulf Marine Biodiversity Conservation, Beibu Gulf University, Qinzhou, PR China Article Information Received 26 January 2019 Revised 12 June 2019 Accepted 04 March 2020 Available online 18 June 2020


INTRODUCTION
S ex is one of the most fundamental features of life.
In recent years, next generation sequencing technology has been widely used in genome and transcriptome analyses with low cost, high efficiency, and superior accuracy, with a growing number of reproduction-related genes found from such studies (Matsumoto et al., 2013;Zhang et al., 2014Zhang et al., , 2019. More than 40 gene models were identified with high accuracy to encode reproduction-related genes reported for P. fucata and other molluscs (Matsumoto et al., 2013). Sex-related genes and pathways have also been identified in the transcriptomes of Crassostrea hongkongensis and Patinopecten yessoensis (Tong et al., 2015;Li et al., 2016b). In the Pacific oyster, certain genes, including SoxH, Foxl2, and Dsx, are thought to be involved in sex determination (Zhang et al., 2014). In P. yessoensis, FoxL2, Dmrt, and soxH are also considered to be key candidates for scallop sex determination/differentiation (Li et al., 2016b). Compared with model species, however, studies on bivalve sex determination genes and pathways are still limited.
The noble scallop (Chlamys nobilis) is an economically important and edible marine bivalve, which has been cultured in the Southern Sea of China since the 1980s (Zheng et al., 2012a). Recent research on C. nobilis O n l i n e

F i r s t A r t i c l e
has focused on genes involved in immunity (Yang et al., 2016;Zhang et al., 2016a) and reproduction (Zheng et al., 2012b;Shi et al., 2014) and on transcriptome sequencing for identification of candidate genes associated with carotenoid-based coloration (Liu et al., 2015). In this study, we employed Illumina sequencing technology and de novo assembly to obtain the transcriptome of the maturing ovaries and testes of C. nobilis. Many genes associated with sex determination and maturation were determined, and various differentially expressed genes (DEGs) between the ovaries and testes were identified. In addition, GO and KEGG enrichment analyses of DEGs demonstrated significant differences between the ovaries and testes. This study will be useful to clarify the molecular mechanisms of gonadal maturation. in C. nobilis.

Tissue collection
Eight healthy specimens of C. nobilis (N = 4 for each sex), averaging 51 mm in shell height, were purchased from an aquaculture farm in Shenzhen, Guangdong Province, China. The ovaries and testes were dissected and freezed in liquid nitrogen and stored at -80°C. A 5-mm thick section of gonad from each individual was cut and then fixed in 4% paraformaldehyde for 24 h.

Sample histology
Histological methodology followed that of Yan et al. (2010). After fixation in 4% paraformaldehyde, the gonad sections were dehydrated in graduated ethanol washes, and embedded in paraffin. Sections (6 µm thick) were prepared, and then stained with hematoxylin and eosin according to routine histological techniques. The sections were examined to assess the stage of gonadal development using a Nikon Eclipse CI microscope equipped with a Nikon DS-U3 image analyzing system (Nikon, Japan).

RNA extraction and cDNA library construction
Total RNA was isolated from each sample using Trizol Reagent (Invitrogen, Carlsbad, USA) following the manufacturer's protocols. Quantity and integrity of the RNA samples were comfirmed by 1% agarose gel and a 2100-Bioanalyzer (Agilent Technologies, USA).1μg of RNA from each ovary or testis were pooled, respectively.
Then cDNA was synthesized using the mRNA fragments as templates as usual and was sequenced by the oebiotech (Shanghai, China) using the Illumina HiSeq TM 2500 sequencing platform (Illumina, Inc, USA).

Sequence data analysis and assembly
Raw reads were processed by the NGS QC Toolkit (Patel et al., 2012). Low quality bases which the percentage of low quality bases (base quality≤20) and reads containing poly-N larger than 35bp were removed to obtain clean reads. De novo assembly of the clean reads was performed using the short reads assembling program Trinity (Grabherr et al., 2011). The TIGR Gene Indices Clustering (TGICL) tool was used to remove redundant sequences and perform further assembly (Pertea et al., 2003).

Sequence annotation
Unigenes were aligned with protein databases NR, Swiss-Prot, and KEGG using BLASTx alignment (e value < 10 -5 ). Gene names were assigned to each assembled sequence based on the best BLAST hit (highest score).
To annotate the assembled sequences with GO terms describing biological processes, molecular functions, and cellular components, the Nr BLAST results were imported into Blast2GO (Conesa et al., 2005;Götz et al., 2008). WEGO was used for GO functional classification of all unigenes (Ye et al., 2006). Additionally, pathway assignments were carried out based on the KEGG database.

Sequence mapping and differential expression analysis
Sequence reads were mapped to reference sequences using bowtie2 (Langmead and Salzberg, 2012) and eXpress. Expression levels for each unigene were measured with FPKM (Fragments Per Kb per Million Fragments) and input into the DEseq R package for differential expression analysis (Anders and Huber, 2013). Only genes with a false discovery rate (FDR) ≤ 0.05 were and log 2 (fold change) values≥3 were selected as the thresholds, to identify differentially expressed genes (DEGs) between the testis (control) and ovary. GO and KEGG enrichment analyses of DEGs were performed using R based on hypergeometric distribution.

Quantitative RT-PCR validation
To validate the Illumina sequencing data, 11 DEGs between the ovary and testis were chosen for quantitative real-time PCR (qRT-PCR). All primers used were manufactured by Invitrogen (Shanghai, China) ( Table I). The RNA samples used for qRT-PCR amplifications were the same as those used to construct the RNA-Seq library mentioned above. For each sample, quantification was performed using gDNA Eraser and the Prime-Script™ Reverse Transcriptase Reagent Kit with gDNA Eraser (TaKaRa, Japan). The β-actin of C. nobilis was used as an internal reference. The qRT-PCR was performed in triplicate for each sample using the CFX96 real-time PCR Detection System (Eppendorf, Germany) in a 25 μl reaction system, with the following components: 12.5 μl of 2 × SYBR Green Real-time PCR Master Mix (Takara, Japan), 0.5 μl of each primer (10 μM), 1.0 μl of cDNA, and 10.5 μl of RNase-free water. The PCR program was as follows: 95°C for 30 s, 40 cycles of 95°C for 5 s, and 60°C for 30 s. To assess the specificity of the PCR amplification, a melting curve was obtained at the end of the reaction, and a single peak was observed. Data were quantified using the 2 -ΔΔCT method. The amplification efficiencies of the target and reference genes were verified and found to be approximately equal. All qRT-PCR data were analyzed using SPSS 19.0. Differences between means were considered significant at the 95% confidence level (P < 0.05). The results were expressed as means ± standard errors.

Histological analysis of gonads
Histology of the sampled C. nobilis gonads revealed that both the female ovaries and male testes were at the maturing stage. The ovaries were filled with oocytes and the testes were filled with sperm ( Fig. 1).

Sequence analysis and assembly
As shown in Table II, a total of 92,932,520 raw sequence reads were generated (46,404,820 and 46,527,700 reads from the ovary and testis, respectively). After stringent quality checking and data clean-up, the two sequence datasets consisted of clean reads only. The number of clean read is same as that of raw read, but with a different number of bases. The valid data rate was 99.99%, which proved that high-quality bases were obtain Comparative Transcriptome Analysis of the Ovary and Testis O n l i n e  during sequencing. Detailed sequencing and assembly results are shown in Table III. All reads were deposited in the Short Read Archive (SRA) of the National Center for Biotechnology Information (NCBI) with the accession number SRX3049257 and SRX3055267. A total of 73,812 unigenes were assembled from the ovary and testis transcriptomes (Table III). The N50 length was 1,806 bp and the average length was 1,072 bp. The number of contigs above 500 bp was 41,944, accounting for 57%. In addition, 21,830 unigenes (30%) were greater than 1,000 bp. The assembly program produced a substantial number of highquality long sequences.

Functional annotation
For validation and annotation of assembled unigenes, BLASTx alignment between unigenes and protein databases, including Nr, Swiss-Prot and KEGG, was conducted, with an E value threshold of 10 -5 (Supplementary Table S1). The best alignment results were used to annotate protein function and determine the sequence direction of the unigenes. The results indicated O n l i n e

F i r s t A r t i c l e
F. Sigang et al.
that 23,535 unigenes (31.89%) were annotated to known proteins in the nr database. Furthermore, 59.95% of the unigenes were matched with C. gigas. In addition, 17,220 unigenes (23.33%) were similar to known proteins in the Swiss-Prot database. We found that 15,714 annotated unigenes (66.77%) were assigned to one or more sub-categories of GO terms. Among the 160,580 GO terms,76,442 (47.60%),63,203 (39.36%), and 20,935 (13.04%) were involved in biological progresses, cellular components, and molecular functions, respectively. For biological processes, genes involved in cellular processes (GO: 0009987) and single-organism processes (GO:0044699) were highly represented. Regarding molecular functions, binding (GO: 0005488) was the most represented GO term, followed by catalytic activity (GO: 0003824). Cells (GO: 0005623) and cell parts (GO: 0044464) were the most represented categories for cellular components.
Unigene KEGG pathway analysis was carried out as an alternative approach for functional categorization and annotation. In total, 6,483 unigenes were divided into six major categories covering metabolism, genetic information processing, environmental information processing, cellular processes, organismal systems, and human diseases. Among these annotated unigenes, 4,521 were classified into metabolism, including carbohydrate metabolism (808 sequences) and amino acid metabolism (686 sequences); 1,910 were grouped into genetic information processing, including translation (783 sequences), folding, sorting, and degradation (640 sequences), transcription (236 sequences), and replication and repair (251 sequences); 3,491 were assigned to environmental information processing; 2,208 were assigned to cellular processes; and 5,823 were classified into organismal systems.

Differential expression analysis
Differential expression analysis revealed that 2,842 sex-biased unigenes (including up-regulated genes and specifically expressed unigenes) were significantly differentially expressed between the ovaries and testes (FDR ≤ 0.05, Supplementary Table S2). Among them, 591 unigenes exhibited biased expression in the ovary and 173 unigenes were expressed specifically in the ovary. A total of 2,251 testis-biased unigenes were identified from the transcriptome and 965 unigenes were exclusively expressed in the testis. Due to the lack of genomic information for C. nobilis, a large fraction of DEGs (63.69%) could not be annotated, although they might contain novel genes important for gonadal differentiation and development. However, some well-known reproduction-related DEGs were detected. For example, dmrt2 and TSSK1 were isolated in testis-biased unigenes, and HTR, Fox2L, nanos,

O n l i n e F i r s t A r t i c l e
Comparative Transcriptome Analysis of the Ovary and Testis vitellogenin, and sox2 were detected in ovary-biased unigenes (Supplementary Table S2). Tables IV and V show the top 10 up-regulated and specifically expressed genes in the testis and ovary of C. nobilis, respectively. We detected 600 DEGs in several GO terms. Among them, 369 were detected in 1,195 testis-biased terms and 231 were detected in 964 ovary-biased terms (Supplementary Tables S3, S4). For the ovary-biased genes, GO terms related to protein glycosylation, fatty acid biosynthetic processes, hydrolase activity, Golgi cisternae membrane, and cell cortex were significantly enriched (Fig. 2a, Supplementary Table S4). Most testis-biased GO enrichment terms were related to male organ formation and spermiogenesis, such as fertilization, male genitalia morphogenesis, spermiogenesis, motile cilium, axoneme, cGMP activity, cAMP binding, and ATPase activity (Fig.  2b, Supplementary Table S4).
KEGG pathways related to metabolism of fatty acid, glycan amino acid, hormones, and mineral absorption were enriched in the ovary-biased genes (Fig. 3a), whereas those related to meiosis, insulin resistance, glycosphingolipid biosynthesis, and adherens junction were enriched in the testis-biased genes (Fig. 3b).

Quantitative RT-PCR validation
To validate the expression profiles obtained from Illumina sequencing analysis, 11 genes were chosen for quantitative RT-PCR. Of these, seven matched the Illumina sequencing results (Table VI). Although the other four genes did not perfectly match the sequencing data, they still showed the same differential expression tendencies, as revealed by RNA-Seq. In general, the qRT-PCR results confirmed that the Illumina data were credible.

DISCUSSION
In this study, the transcriptome of the C. nobilis testis and ovary in the maturing stage were sequenced by RNA-Seq technology. Based on the transcriptome data, genes involved in germline development, gonad development, oocyte maturation, and fertilization were identified. Vasa, nanos, and piwi play important roles in germline development. Vasa is a key determinant in germline formation and in gonad differentiation in eukaryotes, and can be used to track germ cell specification, migration, and differentiation (Gallardo et al., 2007;Wessel et al., 2014;Xu et al., 2014). Nanos plays an important role in primordial germ cell development (Tsuda et al., 2003), whereas piwi deficiency can result in complete depletion of germline stem cells and sterility in both males and females (Lin and Spradling, 1997). Vitellogenin (Vtg), vitellogenin receptor (Vtgr), and estrogen receptor are involved in gonad development. In most oviparous animals, Vtg is a large multidomain apolipoprotein that serves as a precursor of the major egg yolk protein vitellin (Matsumoto et al., 2013). In Argopecten purpuratus, vitellogenin mRNA expression is specifically expressed in the mature ovary (Boutet et al., 2008), which is in accordance with our results (Tables IV and VI). In addition, VgR is responsible for binding circulating Vg and transporting it into oocytes through endocytosis (Tufail and Takeda, 2009). Estrogen receptor regulates the transactivation of estrogen target genes, such as teleost Vtg (Matsumoto et al., 2013). HTR is involved in the induction of oocyte maturation . 17-beta-hydroxysteroid dehydrogenases (17-beta-HSDs) are important enzyme in sex steroid metabolism (Moeller and Adamski, 2009). Some 17-beta-HSDs including 17-beta-HSD6, 17-beta-HSD8, 17-beta-HSD10, 17-beta-HSD11, 17-beta-HSD12 and 17-beta-HSDs14 were found in this study. The substrate of these HSDs are androgens and estrogens. Zona pellucida spermbinding protein is useful in fertilization (Aagaard et al., 2010).
In this study, 591 unigenes exhibited biased expression in the ovary, of which 173 were expressed specifically in the ovary. A total of 2,251 testis-biased unigenes were identified from the transcriptome, with 965 unigenes exclusively expressed in the testis. Our results showing more testis-biased than ovary-biased unigenes agree with previous findings for P. yessoensis (Li et al., 2016b) and P. olivaceus , but differ from that of C. gigas, in which more genes were found in the ovary (Zhang et al., 2014). Genes that exhibited the greatest differential or specific expression in the gonads are listed in Tables IV and V, respectively. Potassium voltage-gated channel protein was highly expressed in the testis, which is the first time this has been found in a mollusc. In mammals, potassium voltage-gated ion channel is present in the testis and spermatozoa (Jacob et al., 2000). K + channel activation is a primary event leading to acrosome reactions in mammalian spermatozoa (Benoff, 2009). TSSK1 is one member of Tssk family. The serine/threonine protein kinase catalytic (S-TKc) domain of TSSK1 is highly conserved in animals. TSSK1 is expressed when spermatids differentiate into morphologically mature spermatozoa during spermiogenesis, which highlights its important role in normal male reproduction (Xu et al., 2008). TSSK1 is only transcribed in the male gonads of Atrina pectinata and is highest in the mature testis , as found in our results (Table IV).
Nutrients and energy reserves are key factors in supporting oocyte development. Glycogens, lipids, and proteins play important roles in the reproductive processes of many bivalves. Glycogens are an important energy source for gonad development in clam (Li et al., 2011;Yan et al., 2010) and oyster (Zeng et al., 2010). Lipids and proteins are accumulated as vitellin, which is a major organic component of oocytes in female gonads. In male gonads, lipids and proteins are energy sources during spermatogenesis after glycogen breakdown (Li et al., 2011;Bi et al., 2016). These above results were further shown in the transcriptome level findings in our study. GO terms and KEGG pathways related to protein glycosylation, fatty acid biosynthetic processes, and hydrolase activity were enriched in the C. nobilis ovary (Figs. 2 and 3). Our results are agreement with previous transcriptome research in ovary. For example, in the Haliotis rufescens ovary, a high percentage of transcripts are associated with metabolism and catalytic and enzymatic activity (Valenzuela-Muñoz et al., 2014); In P. yessoensis, ovary-biased genes are enriched in the glycan-related pathway, which suggests active glycan biosynthesis and metabolism (Li et al., 2016b); Here, the AMPK signaling pathway was identified in the C. nobilis ovary, which was reported in a mollusc ovary for the first time. The AMPK pathway is involved in controlling cellular energy homeostasis and, at the whole animal level, in regulating energy balance and food intake (Proszkowiec-Weglarz et al., 2016). Furthermore, AMPK plays a vital role in ovary development through mediating glucose, insulin-like growth factor-1(IGF-1), follicle stimulating hormone (FSH), and adiponectin (Ratchford et al., 2007;Tosca et al., 2007;Kayampilly et al., 2009;Lu et al., 2008;Chen et al., 2006).
In the C. nobilis testis, GO terms were enriched in male organ formation and spermiogenesis. Some KEGG pathways were similar to those of other molluscs. For example, meiosis is significantly enriched in C. gigas and P. yessoensis testes (Zhang et al., 2014;Li et al., 2016b), and the insulin signaling pathway has been identified in C. hongkongensis (Tong et al., 2015). The glycosphingolipid biosynthesis pathway was first identified in a mollusc testis in this study. Glycosphingolipids are ubiquitous molecules composed of a lipid and carbohydrate moiety. Glycosphingolipids have been isolated and identified in other animal sperm, including that of humans and sea urchins (Ritter, 1987;Ijuin et al., 1996), and are critical for sperm activation in C. elegans (Dou et al., 2012). Further studies on DEGs associated with GO terms and pathways are needed to reveal the different molecular mechanisms that exist between ovarian and testicular maturation processes.