Multi-organ transcriptomic profiles and gene-regulation network underlying vibriosis resistance in tongue sole

Vibrio spp. are major pathogens responsible for mortality and disease in various marine aquaculture organisms. Effective disease control and genetic breeding strategies rely heavily on understanding host vibriosis resistance mechanisms. The Chinese tongue sole (Cynoglossus semilaevis) is economically vital but suffers from substantial mortalities due to vibriosis. Through continuous selective breeding, we have successfully obtained vibriosis-resistant families of this species. In this study, we conducted RNA-seq analysis on three organs, including liver, spleen and intestine from selected resistant and susceptible tongue soles. Additionally, we integrated these data with our previously published RNA-seq datasets of skin and gill, enabling the construction of organ-specific transcriptional profiles and a comprehensive gene co-expression network elucidating the differences in vibriosis resistance. Furthermore, we identified 12 modules with organ-specific functional implications. Overall, our findings provide a valuable resource for investigating the molecular basis of vibriosis resistance in fish, offering insights into target genes and pathways essential for molecular selection and genetic manipulation to enhance vibriosis resistance in fish breeding programs.


Background & Summary
Currently, the world is facing the food security issue, and aquaculture plays an increasingly important role in food and nutrition supply.However, infectious diseases, especially vibriosis, occur frequently and cause huge mortality and economic loss in fish aquaculture.Vibriosis, typically caused by systemic bacterial infections from bacteria of the Vibrio genus, stands as a significant threat in marine aquaculture.Vibriosis is one of the most common bacterial diseases affecting various aquaculture marine fish.Vibriosis accounts for approximately 66.7% of reported diseases in groupers (Epinephelus spp.), impacting all growth stages and causing mortality rates of up to 50% among the fish 1 .Similarly, in the salmon farming industry, Vibrio infections represent the most serious problem to date, leading to losses exceeding USD 100 million in Norway 2 .Similarly, vibriosis has emerged as a significant pathogenic disease, resulting in high mortality rates (50%-70%) in Chinese tongue sole 3 .As in other animal production system, the success and stainability of fish aquaculture largely depends on the control of diseases 4 .A viable and efficient way to prevent the disease outbreaks lies in developing new germplasm and varieties characterized by high disease resistance and productivity.To achieve this, it is crucial to understand the molecular and genetic mechanisms underlying the disease resistance trait.
As known, the disease resistance is a highly complex system, consisting of physical barriers preventing the adhesion and invasion of pathogens, interaction between the pathogens and host, and humoral and cellular immune responses in cells 5 .Previous studies of fish regarding to diseases susceptibility/resistance have focused on identifying immune parameters, genetic architectures, and differences in gene expressions.For example, studies in Atlantic salmon (Salmo salar L.) 6 and rainbow trout (Oncorhynchus mykiss) 7 revealed that the furunculosis resistant fish had higher immune responses such as activity of serum complement and non-alpha(2) mantiprotease.Quantitative trait locus (QTL) mapping and genome-wide association study (GWAS) have identified the genetic architecture and divergent locus/genes in several fish species, such as Atlantic salmon 8 , rainbow trout 9,10 and catfish 11 At the gene expression level, the differential transcriptomic responses to pathogenic infections of the resistant and susceptible fish have been presented in specific tissue, such as in skin of Atlantic salmon upon sea lice infection 12 , in spleen of common carp (Cyprinus carpio) for CyHV-3 virus infection 13 and in intestine of the tongue sole after Shewanella algae infection 14 and Vibrio harveyi infection 15 .These studies together revealed that transcriptomic responses occur systematically in a variety of tissues and organs, and a number of genes and pathways may contribute to the fish disease resistance.However, very few studies were simultaneously carried out on the gene regulatory patterns of multiple tissues and organs, which limited an in-depth understanding of the molecular regulatory mechanisms that could explain the difference in the disease resistance.
The Chinese tongue sole is an economic-important marine fish in China.In recent years, it has suffered from over 70-90% mortality rate caused by vibriosis, caused by the infection of Vibrio pathogens such as V. harveyi and V. anguillarum.To obtain vibriosis resistant germplasms, we have conducted continuous selective breeding programs for 4-5 generations 3,16 .This practice has breed high resistant germplasms, providing a unique opportunity for study the molecular basis underlying the improvement of the vibriosis resistance in fish.In our previous work, we have demonstrated that the selected resistant and susceptible tongue soles were genetically diverged, and identified several genes that were significantly associated with the vibriosis resistance 17,18 .In addition, we compared the gene profiles between resistant and susceptible families, and identified 653 and 1421 DEGs in gill and skin, respectively 18 .
To obtain a holistic insight into the transcriptional regulation for the vibriosis resistance, here we completed the RNA sequencing of liver, spleen and intestine organs from resistant and susceptible tongue soles, respectively.Integrating our previously published transcriptome data of gill and skin tissues, we found that differentially expressed genes (DEGs) were distinct between different organs.Furthermore, weighted gene correlation network analysis (WGCNA) identified tissue-specific functional modules, pathways and genes correlated with the resistant potential against vibriosis.Integrating the DEGs and WGCNA results, our data disclosed that over 1000 genes in different organs may play roles in resisting vibriosis in tongue soles.This is in line with the results in mammals that a number of genes with large range of immune responses control host defense against foreign organisms 19 .These RNA-Seq datasets can be used to discover the plasticity in gene expressions between the resistant and susceptible fish, and thus remedy the knowledge gaps for understanding the interplay between the genetic divergence and the phenotype variation of vibriosis resistance.In addition, our data provide a valuable resource that have important implications enabling effective genetic improvement of disease resistance, which will largely be beneficial to the resistant variety breeding and disease control in aquaculture.

Methods
Ethics statement.This study was carried out in accordance with the recommendations of the Care and Use of Laboratory Animals of the Chinese Academy of Fishery Sciences (Ethics Committee number: YSFRI2019002).The protocol was approved by the Animal Care and Use Committee of the Chinese Academy of Fishery Sciences.

Selective breeding and sample preparation.
The selective breeding of the vibriosis resistant and susceptible families for C. semilaevis were performed as previously described 3 .Firstly, the genetic sex of parental fish was identified by a sex-specific AFLP marker 20 .Full-sib families were then constructed by strip spawning and were reared in separate common tanks under a flow-through system with the same rearing conditions.To trace the lineage, we tagged each family with visible implant elastomers and the pedigree information was precisely recorded 3 .
In each breeding generation, we performed artificial Vibrio challenge tests using the fish with an average size of 10-12 cm.A medial lethal dose (LD 50 ) of V. harveyi challenge was determined as previously described 3 .After intraperitoneal injection, we recorded the mortality of each family at 12 days post injection, and the families with a survival rate >80% and <30% were considered as Vibrio resistant (VR) and susceptible (VS) families, respectively.To date, the artificial challenge and selection for the resistant families processed continuously for more than 12 years, lasting for 5 generations.Fish from the VR and VS families were used for RNA collection (Fig. 1).

RNA-Seq and comparative transcriptomic analyses.
To characterize and compare the gene expression patterns, we collected the liver, spleen and intestine tissues from the resistant and susceptible fish, respectively.Total RNA was extracted from three biological replicates of each group with TRIzol (Invitrogen, USA).Pair-ended (PE) RNA-seq libraries were constructed using the Truseq mRNA stranded RNA-Seq Library Prep Kit (Illumina, USA) according to the standard protocol.Sequencing of totally 18 libraries was conducted with a Illumina HiSeq 2000 sequencing platform, generating raw reads with a read length of 2 × 100 bp and an insert size of 350 bp.The raw data was then quality filtered using RNA-QC-Chain 21 , filtering out the ambiguous N's, adaptors, low quality reads with more than 20% of the bases having a quality score < 20.Finally, we obtained 62.26-79.69 million raw reads per sample, amounting to a total of 120.35 Gb clean data (Table 1).We calculated the fragments per kilobase per million mapped sequence reads (FPKM) value for each gene using RSEM (v1.2.12) 22 , and used NOIseq 23 .to identify the DEGs, with log 2 |FoldChange| ≥ 1 with a probability ≥0.9.
The clean reads were mapped to the reference genome of the tongue sole (NCBI Accession No. GCA_000523025.1) using BWA 24 with default parameters.The average mapping rate ranged from 83.95% to 93.18%, averagely at 88.66% (Table 1).
KEGG and GO enrichment analysis.KEGG and GO enrichment analyses were performed using phyper in R software.KEGG pathways and GO terms with corrected p-values < 0.05 were considered as significantly enriched ones.GO terms are clustered into three main categories: biological processes (BP), molecular functions (MF), and cellular components (CC).

KEGG and GO enrichment
Fig. 1 The experimental workflow for resistant tongue sole selection and RNA-Seq data analysis workflows.In spleen, the DEGs were most significantly abundant in "complement and coagulation cascades" (q < 0.05) (Table 2).Interestingly, the 347 up-expressed genes were mainly involved in immune systems, such as "NOD-like receptor signaling pathway" (q < 0.05), "TNF signaling pathway", "Th1 and Th2 cell differentiation" and "IL-17 signaling pathway", while the 703 down-expressed genes were significantly enriched in a variety of metabolic pathways.The significantly enriched GO_MF terms for the down-expressed genes were "aromatic amino acid family metabolic process" and "oxidoreductase activity" (q < 0.05).
In intestine, we observed 374 over-expressed and 260 down-expressed genes in the resistant fish.The over-expressed genes were mostly abundant in "pancreatic secretion" pathway (q < 0.05) (Table 2) and GO_MF term of "serine-type endopeptidase activity" (q < 0.05).The down-expressed genes were significantly enriched in "fatty acid metabolism" "phagosome" and "apoptosis", as well as a GO term of "phospholipid catabolic process" (q < 0.05).

Weighted gene correlation network analysis (WGCNA).
To gain a holistic and comprehensive gene expression landscape, we used the total 30 RNA-seq datasets, including the newly generated 18 datasets and the previously released 12 datasets, to construct a multiple-organ gene co-expression network.
First, we filtered and normalized the genes based on (1) the FPKM value was larger than 1; (2) genes had the FPKM values in more than 16 samples; (3) adjusted the artifacts to make every gene follow an approximate normal distribution.As a result, 20512 genes were used in the normalized expression matrix.Using the mean FPKM values of these genes, we constructed a gene co-expression network and identified the gene co-expression modules with the R package 'WGCNA' 25,26 .The optimal soft thresholding power of 30 for adjacency computation was graphically determined when the degree of independence was 0.8.Subsequently, the cutreeDynamic function was used for tree pruning of the gene hierarchical clustering dendrograms, resulting in 15 co-expression modules constructed with corresponding Eigengenes (Fig. 2A).Module-sample associations were estimated using the Spearman correlation coefficient analysis between the module eigengenes and tissues type.As a result, we identified 12 modules that were strongly correlated with different tissue types, which was visualized as a heatmap (Fig. 2B).All these tissues might be involved in the determination or regulation of the disease resistance.
Genes in the salmon module had a significant correlation with the gill samples and were significantly enriched in KEGG pathways of "nitrogen metabolism" and "proximal tubule bicarbonate reclamation" (q < 0.05), as well as GO_MF terms of "ammonium transmembrane transporter activity" (q < 0.05).
The purple module had a correlation with spleen.Genes in this module were overrepresented in the digestive system, such as "pancreatic secretion", "protein digestion and absorption" and "fat digestion and absorption" (q < 0.05), as well as GO_MF terms of a series of "peptidase activity" (p < 0.05).
The cyan and green modules, which were strongly correlated with the liver tissue, were enriched in the KEGG pathways of "complement and coagulation cascades" and "phagosome" (q < 0.05).
The modules in the red and yellow color were correlated with intestine.Genes in these modules were significantly enriched for "glycosphingolipid biosynthesis", "endocrine, renin-angiotensin system" and "mineral absorption" pathways (q < 0.05).
Genes in the brown, pink, tan, black, magenta and the greenyellow modules were correlated with skin.Among them, the brown, pink, tan and black modules were enriched in a number of cellular community pathways, such as "focal adhesion", "tight junction", and "adherens junction" (q < 0.05).The greenyellow module was  mainly enriched for "cardiac muscle contraction" and signal transduction pathways, such as "calcium signaling pathway" and "cGMP-PKG signaling pathway" (q < 0.05).Overall, our results suggest that resistance to Vibrio involves a multifaceted response encompassing metabolic adaptations, immune signaling, and tissue-specific gene expression patterns.These findings have implications for understanding the molecular basis of resistance and can inform the development of breeding strategies aimed at enhancing disease resistance in aquaculture species.

Data Records
The raw data generated from Illumina HiSeq 2000 sequencing platform have been deposited in the NCBI Sequence Read Archive (SRA) database under the NCBI project (https://www.ncbi.nlm.nih.gov/bioproject/) with an accession number of PRJNA785712 27 .The gene expression data, including gene count, FPKM values and the DEGs set, have been deposited in the GEO database with an accession number of GSE270251 28 .
The gene expression matrix used for gene co-expression network inference, and gene co-expression network analysis results including genes per module are available at Figshare 29 .

Technical Validation
To validate the gene expression levels obtained by RNA-Seq method, we carried out qPCR analyses to quantify the mRNA levels of 12 randomly selected genes from different tissues.The consistent results of qPCR and RNA-seq indicated that the RNA-Seq results were reliable for further analyses (Fig. 3).

Fig. 2
Fig. 2 Multiple-tissue gene co-expression network for vibriosis resistance in the tongue sole.(A) Hierarchical clustering dendrograms of the co-expressed genes in the modules were identified by WGCNA.A total of 15 modules are colored on the left and top side.The heatmap shows the Person's association matrix among all genes in the analysis.Yellow color represents low associations and progressively darker red color represents higher associations between genes.(B) Associations between co-expressed gene modules and tissue types.Each row corresponds to a module, and each column to a sample.Each cell contains the corresponding correlation with a p-value.The table is color-coded by correlation with intensity and directions of correlations indicated on the right side of the heatmap.

Fig. 3 Q
Fig. 3 Q-PCR validation of the gene expression levels obtained from RNA-seq.VS: vibriosis susceptible fish, VR: vibriosis resistant fish.Bar height indicates the gene expression level measured by q-PCR (left Y-axis, relative expression measured by ΔΔCt method) and line graph represents RNA-seq measurements (right Y-axis, Fragments Per Kilobase Million, FPKM).X-axis shows gene names with tissues in parentheses.

Table 1 .
Summary of the RNA-Seq data for liver, spleen, and intestine from resistant and susceptible families.

Table 2 .
The enriched KEGG pathways of the DEGs soles in liver, spleen and intestine between the resistant and susceptible tongue soles.