Skin transcriptome analysis of the genes involved in mucous immunity and discovery of SSRs and SNPs in Crucian carp Carassius auratus

Fish skin is the first line of defence against the attachment and penetration of pathogens. Crucian carp (Carassius auratus) is one of the major freshwater species and an important food fish in China, yet the molecular mechanism of its skin immune response remains unclear. In this study, a de novo transcriptome assembly of crucian carp skin was performed using the Illumina HiSeq 2000 platform. A total of 49,154,776 unigenes were assembled, among which 60,824 (25.16%) unigenes were annotated against the NCBI database. Gene ontology and KEGG mapping assigned terms to 37,103 (39.15%) and 9,337 (23.23%) unigenes, respectively. The best represented KEGG categories were immune system (8,871, 20.50%) and signal transduction (7,805, 18.04%). Several mucin proteins (MUC1, MUC2, MUC5B, MUC5AC, MUC18, etc.), the main membrane-bound and secreted components of the mucosal layer, were also identified, implying the differentiation of immune responses in the crucian carp skin. Moreover, 28,928 potential SSRs and 249,964 potential SNPs were detected in crucian carp. This is the first report on transcriptome analysis of the skin of C. auratus, and it contributes to understanding the molecular mechanisms of the mucous immunogenetic response and epidermal mucus secretion of fish skin.

Evidence is accumulating that skin is a peripheral immune organ in teleosts [2,13,14]. However, genomic information such as the skin transcriptome of crucian carp is lacking.
Transcriptomic tools are commonly used to profile the expression of immune genes and to find new immune genes in fish [15,16]. The annotation rate was lower than those reported in high-throughput sequencing studies of other fish species, such as turbot (44.84%) [17] and mud loach (43.76%) [18], but comparable to that previously reported for crucian carp (17.44%) [5]. Fish skin performs various vital functions, especially in immunity and defense against invading pathogens and environmental stressors [19,20]. Pooling RNA samples from multiple individuals followed by transcriptome analysis using 454 sequencing provides an excellent opportunity to generate large numbers of SNP markers [21]. Analysis of the skin transcriptomes of sea trout (Salmo trutta m. trutta), Japanese flounder (Paralichthys olivaceus) and mud loach (Misgurnus anguillicaudatus) revealed putative genes involved in immunity and epidermal mucus secretion, suggesting the complexity of immune mechanisms in fish skin [12,22,18]. However, transcriptomic analysis of crucian carp skin remains to be performed.
In this study, we assembled and characterized the skin transcriptome of crucian carp using the Illumina HiSeq 2000 system. A large number of genes involved in immune reactions and mucus secretion were identified. Moreover, this study is the first to characterize the crucian carp skin transcriptome with respect to innate immunity and molecular markers. The results will help to develop a database for crucian carp and lay a solid scientific foundation for functional genomics in fish.

De novo assembly and transcriptome annotation of the skin of crucian carp
To characterize the skin transcriptome of crucian carp, total RNA samples isolated from the skin were subjected to library construction and high-throughput sequencing on the Illumina HiSeq 2000 platform. A total of 4,915,776 unigenes were finally assembled from the filtered short reads. The total number, mean length and N50 value of the assembled unigenes were 129,797, 649 bp and 1,407 bp, respectively (Table 1).
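The mean length and N50 reported for the assembly are standard summary statistics. As a minimal sketch of how they are defined (the function names and the toy lengths below are illustrative assumptions, not the authors' pipeline or data), they could be computed from a list of unigene lengths as follows:

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


def mean_length(lengths):
    """Return the arithmetic mean of the sequence lengths."""
    return sum(lengths) / len(lengths)


# Toy lengths in bp (hypothetical, not data from this study).
toy_lengths = [200, 300, 400, 500, 1000, 1600]
print("N50:", n50(toy_lengths))
print("Mean length:", round(mean_length(toy_lengths), 2))
```

Because N50 weights long sequences by the bases they contain, it is typically larger than the mean length when an assembly has many short unigenes, which is the pattern seen in Table 1.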
To further understand the crucian carp skin transcriptome, the assembled unigenes were aligned to the NCBI nr database. In total, 60,824 (25.16%) of the assembled unigenes were annotated by BLASTX searches against the NCBI non-redundant protein database. The similarity distribution of the best BLAST hits is displayed in Figure 1A. The E-value distribution of the top hits indicated that 15.35% of contigs showed excellent matches (E-value = 1e-30 ~ 1e-15) and that 19.63% had highly and moderately significant homology (E-value = 0; Figure 1B). The species distribution of the best BLAST hits is shown in Figure 1C. Most of the matched contigs had hits in the family Cyprinidae: zebrafish (Danio rerio, syn. Brachydanio rerio), 79.54%, and common carp (Cyprinus carpio), 2.25%. Moreover, 18.21% exhibited similarity to other organisms, mainly eukaryotes: fishes, mammals and birds.

Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis
Gene Ontology (GO) category enrichment analysis showed that cellular process, metabolic process, biological regulation and response to stimulus were the four most abundant GO terms in crucian carp skin, as shown in Figure 2. A total of 337,407 GO terms were assigned to 37,103 unigenes (39.15% of the total assembled sequences). The numbers of mapped GO terms for biological process, molecular function and cellular component were 172,651 (51.17%), 47,757 (14.15%) and 116,999 (34.68%), respectively. Within the biological process category, contigs involved in cellular process (27,381, 15.86%) and single-organism process (22,494, 13.03%) were highly represented. Binding (24,504, 51.31%) and catalytic activity (13,906, 8.18%) were the dominant groups within the molecular function category. In the cellular component category, cell (26,447, 22.60%) and cell part (26,446, 22.60%) were the most represented subcategories.
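The category percentages above are shares of the total number of assigned GO terms. A short sketch of that arithmetic, using the three top-level counts reported in the text (the helper name `shares` is an illustrative assumption), is:

```python
# Top-level GO term counts as reported in the text.
go_term_counts = {
    "biological process": 172_651,
    "molecular function": 47_757,
    "cellular component": 116_999,
}


def shares(counts):
    """Return each count as a percentage of the summed counts (2 decimals)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


total_terms = sum(go_term_counts.values())
print("Total GO terms:", total_terms)
for name, pct in shares(go_term_counts).items():
    print(f"{name}: {pct}%")
```

The same helper can be applied to the subcategory counts within one top-level category, in which case the denominator is that category's total rather than the overall total.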
The annotated unigenes were further grouped using the Cluster of Orthologous Groups (COG) database. As shown in Figure 3, 41,200 unigenes were classified into 25 functional families. The largest family was translation, ribosomal structure and biogenesis (3,011, 7.34%), followed by signal transduction mechanisms (2,631, 6.39%) and cell cycle control, cell division, chromosome partitioning (2,619, 6.36%). A number of unigenes (124, 0.30%) were assigned to defense mechanisms that might be closely related to the crucian carp immune defense.
Using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database, we further analyzed gene-related signaling pathways. A total of 43,269 (17.90%) unigenes were assigned to 259 pathways, including several significant immune-related pathways. Within the top five KO categories (metabolism, genetic information processing, environmental information processing, cellular processes and organismal systems), the subcategory immune system (8,871, 20.50%) was the most highly represented, followed by signal transduction (7,805, 18.04%), nervous system (4,708, 10.88%), cellular community (4,261, 9.85%), transport and catabolism (3,776, 8.73%), endocrine system (3,324, 7.68%) and signaling molecules and interaction (3,234, 7.47%; Table 2).

Epidermal mucus secretion
Based on the top BLAST hit descriptions, contigs annotated as mucin genes were extracted from the crucian carp skin transcriptome and used for further analysis. The contigs that BLASTX identified as showing significant homology to fish mucin genes are listed in Table 3.

Immune-related pathways
The immune responses in the skin of crucian carp have not been characterized. To obtain a better overview of the immune system in crucian carp, we further identified immune-relevant unigenes from the transcriptome. All the immune pathways annotated in the crucian carp skin transcriptome are listed in Table 4. KEGG annotation identified 259 genes related to the immune system; these genes were distributed in 15 pathways. The dominant pathways included chemokine signaling, leukocyte transendothelial migration and Fc gamma R-mediated phagocytosis. The RIG-I-like receptor signaling pathway (95.83%), antigen processing and presentation (88.10%) and the T cell receptor signaling pathway (78.87%) contained the highest ratios of identified genes versus the total number of known genes in the reference pathway.
Thus, these results have provided an overview of the pathways involved in the immune functions of crucian carp skin.

Discussion
The innate immune system is the only known defense weapon of invertebrates and a fundamental defense mechanism of fish [26]. The important defensive role of the skin in the innate immune system is well known and has been studied in several fish species [27-29, 18, 30]. Moreover, crucian carp is a mainly omnivorous fish that feeds on plants, prefers to live in groups and is selective about its food, and it has long been used for food and medical purposes in eastern Asia. Therefore, characterization of its transcriptome will be of great value for the breeding, cultivation and disease prevention and control of this species.
In recent years, transcriptome analysis via next-generation sequencing technology has brought new insight into whole transcriptomes in many organisms [31,32,5,33-35]. The immune functions of fish skin have attracted intensive interest from the research community, and a large number of antimicrobial and bioactive substances have been identified in the skin mucus of fish [36-38]. In the present study, we sequenced and analyzed the skin transcriptome of crucian carp. For biological processes, most of the transcripts were related to cellular process and metabolic process, in agreement with most other studies [35,39]. The activities of these biological processes are perhaps the basis of the rapid protein biosynthesis and secretory ability of fish epidermis.
Fish skin has vital biological functions including chemical and physical protection, sensory activity, behavioral purposes, thermoregulation, hormone metabolism, maintenance of fluid balance and osmotic homeostasis [11,40]. Energy metabolism and cellular differentiation were the two major subclasses in the metabolism analysis. Energy metabolism-related pathways such as oxoacid metabolic process and carboxylic acid metabolic process were clearly detected after 2 h of stress. In previous studies of ischemic/hypoxic brain, energy metabolism was considered to play a key role in protecting organs from the consequences of energy deprivation [41]. MAP kinases (MAPKs) play significant roles in the immune system by regulating key cellular events including cell migration and phagocytosis [42,43]. A broad range of energy metabolism functions were also enriched among the KEGG pathways, such as arginine and proline metabolism and fructose and mannose metabolism. In summary, the transcriptome profile analysis provides valuable information for future studies on developmental changes and broad stress responses in teleosts.
Fish possess numerous distinct and complex defense mechanisms to protect themselves from pathogenic infections, among which skin mucus acts as the first line of physical defense against pathogens [44]. Skin mucus has been studied in several fish species [36,28,17]. Recently, the skin transcriptomes of Atlantic salmon (Salmo salar), mud loach (Misgurnus anguillicaudatus) and sea trout (Salmo trutta m. trutta) were assembled and several mucin genes were identified [29,18,12].
Mucins are high molecular weight glycoproteins and are the main constituents of fish skin mucus, which can produce a protective gel to bind a range of bacteria and which constitutes an important part of the mucosal defense against infection [45].
Although immune-related genes in Japanese flounder have been previously characterized from ESTs or EST-based microarray chips [47,48], studies on the immune system of this species are limited by the lack of transcriptomic and genomic resources. Analysis of the skin transcriptomes of sea trout (Salmo trutta m. trutta) and mud loach (Misgurnus anguillicaudatus) revealed putative genes involved in immunity and epidermal mucus secretion, suggesting the complexity of immune mechanisms in fish skin [12,18]. Immune genes and pathways in fish tissues such as gill, liver, spleen, head kidney and larvae of turbot [17] and head kidney of grass carp [49] have been previously characterized using RNA-seq. In crucian carp skin, our results identified members of the T cell receptor signaling pathway (78.87% of known genes) and the B cell receptor signaling pathway (35.85%). B lymphocytes and T lymphocytes participate in specific antigen defense. B cells are involved in adaptive humoral immunity through antibody production, antigen presentation, and memory B cell development after antigen-mediated activation. B cell activation is achieved through the binding of antigen to the B cell receptor (BCR) located on the outer surface of B cells [50]. T cells are involved in cell-mediated immunity through phagocyte activation, antigen-specific cytotoxic T lymphocytes, and the release of various cytokines in response to an antigen. T cell activation is achieved when the T cell receptor (TCR) recognizes antigens presented by major histocompatibility complex (MHC) molecules [51]. In this transcriptome study, 38 unigenes were mapped to the B cell receptor signaling pathway, which consists of 106 known genes, and 56 unigenes were mapped to the T cell receptor signaling pathway, which consists of 71 known genes. Consequently, the B cell receptor and T cell receptor signaling pathway members detected in the crucian carp transcriptome will enable investigations into the mechanisms of these pathways.
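The pathway percentages quoted above are the ratio of mapped unigenes to the number of known genes in the reference pathway. A minimal sketch of that calculation, using the two pairs of counts given in the text (the function name `pathway_coverage` is an illustrative assumption), is:

```python
def pathway_coverage(mapped, known):
    """Return mapped genes as a percentage of known reference pathway genes."""
    if known <= 0:
        raise ValueError("known must be a positive gene count")
    return 100 * mapped / known


# (mapped unigenes, known genes in the reference pathway), from the text.
pathways = {
    "B cell receptor signaling pathway": (38, 106),
    "T cell receptor signaling pathway": (56, 71),
}

for name, (mapped, known) in pathways.items():
    print(f"{name}: {pathway_coverage(mapped, known):.2f}%")
```

A high ratio indicates that most reference members of a pathway have at least one matching unigene; it does not by itself indicate expression level.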

Conclusions
This study investigated the skin transcriptome of the crucian carp using the Illumina HiSeq 2000 platform. A total of 49,154,776 unigenes were finally assembled, 60,824 (25.16%) of which were annotated by BLAST searches. A large number of contigs were classified according to GO and KEGG terms, which represent multiple signaling pathways and processes. These data provide a rich source for discovering and identifying immune-relevant genes, and the new mucin sequences (e.g., MUC5AC, MUC2, MUC18) identified here will facilitate our understanding of the mechanisms involved in the immune response. Furthermore, the potential SSRs and SNPs found in this study are important resources for the future development of a linkage map or marker-assisted breeding programs for the crucian carp.

Fish
Crucian carp (average weight 50 ± 5 g, body length 13 ± 2 cm) were purchased from a local fish farm (Xinxiang, China). All procedures involving the handling and treatment of fish used during this study were approved by the Henan Normal University Institutional Animal Care and Use Committee (HNU-IACUC) prior to initiation. Fish were anesthetized by immersion in a bath containing 10 mg/L benzocaine. Fish were acclimatized at 20 ± 2 ℃ in 45 L tanks with aerated fresh water for 15 days.
Fish were fed commercial dry feed, distributed manually twice a day. After 2 weeks of acclimation, fish were randomly sampled to test for pathogenic bacteria in the liver, spleen and kidney. Thirty healthy fish were randomly selected for the experiments, and samples of dorsal skin around the central part of the body were collected with sterile scissors. Skin samples from three fish were pooled, immediately frozen in liquid nitrogen and then stored at -80 ℃ for total RNA extraction.

RNA extraction, RNA-Seq sample preparation
Total RNA was extracted from the skin samples using Trizol reagent (Invitrogen) and was further purified using an RNeasy Mini kit (Qiagen) according to the manufacturer's instructions. The quantity, purity and integrity of RNA were measured on a Nanodrop-2000 spectrophotometer. Samples of higher quality (absorbance ratio at 260 nm/280 nm > 1.9) were selected for high-throughput sequencing. The crucian carp skin HiSeq sequencing library was constructed following the manufacturer's protocol. Sequencing was performed on an Illumina HiSeq 2000 platform in paired-end mode with 90 bp reads.

De novo assembly, comparative analysis, and functional annotation of the transcriptome
Transcriptome sequencing was conducted on an Illumina HiSeq 2000 sequencing platform. To better assemble the entire transcriptome de novo, a paired-end (PE) sequencing strategy was used.
De novo transcriptome assembly was carried out with the short-read assembly program Trinity.
The raw data generated from Illumina sequencing were first quality-filtered to eliminate adaptor sequences, low-quality bases (Q < 20) and unpaired reads [23]. The longest sequence in each cluster was retained and designated as a unigene. The assembled unigenes of the crucian carp were compared to the unique nucleotide sequences of crucian carp deposited in the NCBI databases using the BLASTN algorithm. KEGG pathway and COG annotations were performed using BLASTX searches against the Nr, Swiss-Prot, KEGG and COG databases.
Sequences were checked and confirmed for homologs in the GenBank nr database [46] using the programs BLASTX and BLASTN (http://blast.ncbi.nlm.nih.gov/Blast.cgi). The outputs of BLAST searches against the NCBI nr protein database were imported into the Blast2GO program [24] for GO term mapping. Multiple sequence alignments were carried out using the GeneDoc program, and phylogenetic analysis was performed using the MEGA 5.1 software.
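Two of the steps described above, removing low-quality bases (Q < 20) and keeping the longest sequence per cluster as the unigene, can be illustrated with a small sketch. The text does not specify the authors' filtering tool or settings, so everything here is an assumption for illustration: the 3'-end trimming strategy, the Phred+33 quality encoding, the function names and the toy records.

```python
def trim_low_quality_tail(sequence, quality, min_q=20, offset=33):
    """Trim bases from the 3' end while their Phred score is below min_q.

    Assumes `quality` is a FASTQ quality string with the given ASCII
    offset (33 here; older Illumina pipelines used 64).
    """
    end = len(sequence)
    while end > 0 and ord(quality[end - 1]) - offset < min_q:
        end -= 1
    return sequence[:end], quality[:end]


def longest_per_cluster(transcripts):
    """Keep the longest sequence of each cluster as its unigene.

    `transcripts` is an iterable of (cluster_id, seq_id, sequence) tuples.
    Ties keep the first sequence seen.
    """
    best = {}
    for cluster_id, seq_id, sequence in transcripts:
        current = best.get(cluster_id)
        if current is None or len(sequence) > len(current[1]):
            best[cluster_id] = (seq_id, sequence)
    return best


# Toy records (hypothetical, not data from this study).
print(trim_low_quality_tail("ACGTAC", "IIII##"))
print(longest_per_cluster([
    ("c1", "t1", "ACGT"),
    ("c1", "t2", "ACGTACGT"),
    ("c2", "t3", "AC"),
]))
```

In the toy read, `I` encodes Q40 and `#` encodes Q2 under Phred+33, so only the last two bases are removed.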

Identification of SSR and SNP
MISA (MIcroSAtellite; http://pgrc.ipk-gatersleben.de/misa/) was used to identify putative SSRs in the unigenes from the assembled transcripts. The parameters were set to ≥10 repeat units for mononucleotide SSRs, ≥6 repeat units for dinucleotide SSRs, and ≥5 repeat units for trinucleotide, tetranucleotide, pentanucleotide and hexanucleotide SSRs [5]. Compound repeats, composed of two or more microsatellite sequences separated by no more than 100 bases, were also identified. To identify putative SNPs in the crucian carp transcriptome, QualitySNP was used to identify candidate SNP markers [25].
Figure 2 Gene ontology (GO) classification of the assembled unigenes in crucian carp. Results of BLASTX searches against the NCBI nr protein database were imported into Blast2GO software for GO term mapping and annotation. The number and ratio of sequences assigned to level 2 GO terms in the sub GO categories biological process, molecular function and cellular component are shown.
Figure 3 The cluster of orthologous groups (COG) classification of the unigenes from crucian carp.
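The SSR search criteria given under Identification of SSR and SNP (at least 10 repeat units for mononucleotide motifs, 6 for dinucleotide motifs and 5 for tri- to hexanucleotide motifs) can be illustrated with a regular-expression sketch. This is not MISA itself and does not reproduce its output format or its compound-repeat handling; the function names and the toy sequence are assumptions for illustration only.

```python
import re

# Minimum repeat units per motif length, as stated in the text.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """Return True if the motif is not a repetition of a shorter unit."""
    n = len(motif)
    return not any(
        n % k == 0 and motif[:k] * (n // k) == motif for k in range(1, n)
    )


def find_ssrs(sequence):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    sequence = sequence.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A motif of unit_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs such as "AA" that are really shorter repeats.
            if is_primitive(motif):
                repeats = len(match.group(0)) // unit_len
                hits.append((match.start(), motif, repeats))
    return sorted(hits)


# Toy sequence (hypothetical): (A)10, (AG)6 and (ACG)5 with short spacers.
toy = "TTG" + "A" * 10 + "CCGT" + "AG" * 6 + "TTC" + "ACG" * 5 + "T"
for start, motif, repeats in find_ssrs(toy):
    print(start, motif, repeats)
```

The primitive-motif check prevents a long mononucleotide run from also being reported as a dinucleotide or trinucleotide SSR.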

Supplementary Files
This is a list of supplementary files associated with this preprint: Tables.docx