Selection of soft thresholds
In weighted gene co-expression network analysis (WGCNA), the correlation between any two genes is usually calculated to evaluate whether their expression patterns are similar, and the similarity is then determined by the magnitude of the correlation after it has been raised to a power (the soft threshold β). A larger squared correlation coefficient (R²) for the scale-free fit indicates that the connections between genes in the network more closely obey a scale-free network distribution. Figure 1 shows the schematic diagram of soft threshold selection. The horizontal axes of both the left and right plots represent the soft threshold. The vertical axis of the left plot represents the squared correlation coefficient of the scale-free fit for the corresponding network. The vertical axis of the right plot represents the mean of all gene adjacency functions (the mean connectivity) in the corresponding network. The optimal β value was the soft threshold used for the subsequent analysis. As can be seen in the figure, the optimal soft threshold was 9.
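The two quantities plotted against the soft threshold can be sketched in a few lines. This is a minimal illustration, not the procedure used in this study: the adjacency is assumed to be |r|^β, the scale-free fit is taken as the R² of log10 p(k) against log10 k over equal-width connectivity bins, and the expression matrix, the bin count, and the R² cutoff of 0.8 are all hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def abs_correlations(expr):
    """|r| for every gene pair; expr is a list of per-gene sample vectors."""
    n = len(expr)
    return [(i, j, abs(pearson(expr[i], expr[j])))
            for i in range(n) for j in range(i + 1, n)]


def connectivity(pairs, n_genes, beta):
    """k_i = sum over j != i of |r_ij| ** beta (assumed adjacency function)."""
    k = [0.0] * n_genes
    for i, j, r in pairs:
        a = r ** beta
        k[i] += a
        k[j] += a
    return k


def scale_free_r2(k, n_bins=10):
    """R^2 of the linear fit of log10 p(k) on log10 k over equal-width bins."""
    lo, hi = min(k), max(k)
    if hi == lo:
        return 0.0
    width = (hi - lo) / n_bins
    counts, sums = [0] * n_bins, [0.0] * n_bins
    for v in k:
        b = min(int((v - lo) / width), n_bins - 1)
        counts[b] += 1
        sums[b] += v
    xs, ys = [], []
    for c, s in zip(counts, sums):
        if c > 0 and s > 0:  # skip empty bins
            xs.append(math.log10(s / c))
            ys.append(math.log10(c / len(k)))
    if len(xs) < 3:
        return 0.0
    return pearson(xs, ys) ** 2


def soft_threshold_table(expr, betas):
    """One row (beta, scale-free R^2, mean connectivity) per candidate beta."""
    pairs = abs_correlations(expr)
    rows = []
    for beta in betas:
        k = connectivity(pairs, len(expr), beta)
        rows.append((beta, scale_free_r2(k), sum(k) / len(k)))
    return rows


def pick_beta(rows, r2_cut):
    """Smallest beta whose R^2 reaches r2_cut; rows must be sorted by beta."""
    for beta, r2, _ in rows:
        if r2 >= r2_cut:
            return beta
    return None


# Hypothetical random expression matrix: 40 genes x 12 samples.
random.seed(0)
expr = [[random.gauss(0, 1) for _ in range(12)] for _ in range(40)]
rows = soft_threshold_table(expr, betas=range(1, 13))
for beta, r2, mean_k in rows:
    print(f"beta={beta:2d}  R^2={r2:.2f}  mean connectivity={mean_k:.3f}")
print("chosen beta:", pick_beta(rows, r2_cut=0.8))
```

Because every |r| is at most 1, the mean connectivity can only fall as β grows, which is why the choice of β trades a better scale-free fit against a sparser network.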
Correlation Analysis Of Samples And Traits
The heatmap of the correlation between samples and traits is shown in Fig. 2. Sample X5-60 correlated well with both salinity and strain. In addition to X5-60, samples Ctr-60 and X3-60 correlated well with salinity, and sample X5-30 correlated well with strain. The difference at the genetic level between X5 and both Ctr and X3 was also very obvious at the different salinities. This indicated that, after mutagenesis, X5 differed distinctly from Ctr at the genetic level, whereas the difference between X3 and Ctr was less distinct.
Hierarchical Clustering Analysis Of Each Module
In WGCNA, a clustering tree was constructed and genes were divided into modules according to the correlation of their expression. If certain genes always show similar expression changes in a physiological process or across different tissues, they may be functionally related and can be defined as a module. In Fig. 3, each color in the lower half represents a module. In the tree diagram in the upper half, the vertical distance represents the distance between two nodes (genes), and the horizontal distance has no meaning. The figure shows that the color modules mainly included black, blue, brown, cyan, green, greenyellow, grey and turquoise, among others. Ranked from largest to smallest, the first four modules were turquoise, blue, brown and green. The turquoise module was the largest of all modules, with 2117 genes, followed by blue with 1212 genes, brown with 984 genes and green with 599 genes.
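The idea of cutting a clustering tree into modules can be illustrated with a deliberately simplified sketch. It assumes a dissimilarity of 1 - |r| between genes, average linkage, and a fixed cut height. None of these choices is stated in the text, and WGCNA implementations commonly use other dissimilarities and tree-cutting rules. The gene names and expression values are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cluster_modules(expr, cut_height):
    """Average-linkage clustering on 1 - |r|; merging stops at cut_height.

    expr maps gene name -> list of expression values across samples.
    Returns the modules (lists of gene names), largest first.
    """
    genes = list(expr)
    dist = {frozenset((a, b)): 1.0 - abs(pearson(expr[a], expr[b]))
            for a, b in combinations(genes, 2)}

    def linkage(c1, c2):
        # Mean pairwise dissimilarity between two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [[g] for g in genes]
    while len(clusters) > 1:
        height, i, j = min((linkage(clusters[i], clusters[j]), i, j)
                           for i, j in combinations(range(len(clusters)), 2))
        if height > cut_height:  # remaining clusters are too dissimilar
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return sorted(clusters, key=len, reverse=True)


# Hypothetical expression profiles over six samples.
expr = {
    "g1": [1, 2, 3, 4, 5, 6],
    "g2": [2, 4, 6, 8, 10, 12.5],
    "g3": [1, 2.2, 2.9, 4.1, 5, 6.2],
    "g4": [5, 1, 5, 1, 5, 1],
    "g5": [4, 1.2, 4.1, 0.8, 4.2, 1],
}
modules = cluster_modules(expr, cut_height=0.5)
for rank, module in enumerate(modules, start=1):
    print(rank, len(module), sorted(module))
```

Sorting the resulting modules by size mirrors the ranking of modules by gene count described above.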
Heatmap Analysis Of Correlation Between Samples, Sample Traits And Modules
The heatmap of the correlation between samples, sample traits and color modules is shown in Fig. 4. The horizontal axis of the heatmap lists the traits and the samples (11 columns in total), and the vertical axis lists the color modules. The number in each cell represents the strength of the correlation between the module and the sample or trait. The closer the value is to 1, the stronger the positive correlation, and the closer it is to -1, the stronger the negative correlation. The number in parentheses is the P value, and the smaller this value, the more significant the correlation. In Fig. 4, the modules that correlated well with the salinity trait and had significant P values were blue, red, and tan. The modules that correlated well with the strain trait and had significant P values were yellow, cyan, black, and lightcyan.
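The numbers in such a heatmap can be sketched as correlations between a per-module summary profile and each trait or sample indicator. The sketch below makes several assumptions that are not taken from the text: the module is summarized by the mean of z-scored gene expression (a simple stand-in for a module eigengene), the strain trait is coded as 0/1, each sample is represented by a 0/1 indicator column, and all expression values are hypothetical. P values are not computed here.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def zscore(values):
    """Center and scale one gene's expression across samples."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if sd == 0:
        return [0.0] * n
    return [(v - mean) / sd for v in values]


def module_profile(gene_rows):
    """Mean z-scored expression per sample (stand-in for an eigengene)."""
    z = [zscore(row) for row in gene_rows]
    return [sum(col) / len(col) for col in zip(*z)]


def module_trait_correlations(modules, traits):
    """Map (module, trait) -> Pearson r between module profile and trait."""
    table = {}
    for m_name, rows in modules.items():
        profile = module_profile(rows)
        for t_name, t_values in traits.items():
            table[(m_name, t_name)] = pearson(profile, t_values)
    return table


samples = ["Ctr-30", "Ctr-45", "Ctr-60", "X5-30", "X5-45", "X5-60"]
traits = {
    "salinity": [30, 45, 60, 30, 45, 60],
    "strain": [0, 0, 0, 1, 1, 1],  # assumed coding: 1 = X5
}
for s in samples:  # one 0/1 indicator column per sample
    traits[s] = [1 if s == t else 0 for t in samples]

# Hypothetical modules, each a list of per-gene expression rows.
modules = {
    "up_with_salinity": [[1, 2, 3, 1, 2, 3], [2, 4, 6, 2, 4, 6]],
    "x5_specific": [[1, 1, 1, 3, 3, 3], [5, 5, 5, 9, 9, 9]],
}
table = module_trait_correlations(modules, traits)
for (m_name, t_name), r in sorted(table.items()):
    print(f"{m_name:18s} {t_name:9s} r={r:+.2f}")
```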
Module Gene Expression Pattern Analysis And Enrichment Pathway Analysis
After analyzing the correlation between traits and modules, three salinity-related modules were selected: the blue and red modules were positively correlated with the salinity trait, and the tan module was negatively correlated with it. Among the strain-related modules, black and lightcyan were positively correlated with the strain trait, and yellow and cyan were negatively correlated with it. We then analyzed the gene expression patterns in these strongly correlated modules.
Figure 5 shows the color modules that correlated well with the salinity trait, namely blue, red and tan. In the blue module, overall gene expression in Ctr, X3 and X5 increased with increasing salinity. At a salinity of 60‰, expression in X5 was higher than in Ctr and X3. The genes in this module may be key genes that enable X5 to tolerate high salinity. In the red module, overall gene expression in Ctr, X3, and X5 likewise increased with increasing salinity. At a salinity of 60‰, expression in Ctr was higher than in X3 and X5. The genes in this module may be key genes for the response of Ctr to high salt stress. In the tan module, overall gene expression in Ctr, X3, and X5 decreased with increasing salinity. At a salinity of 60‰, expression in Ctr was lower than in X3 and X5. The genes in this module may also be among the key genes of Ctr in response to high salt stress. This analysis shows that, with increasing salinity, the genes of these three modules changed in the same direction in all three strains, and only the degree of change varied between strains.
Figure 6 shows the modules that correlated well with the strain trait, namely black, cyan, lightcyan, and yellow. In the black module, overall gene expression in Ctr, X3, and X5 increased with increasing salinity. At salinities of 30‰ and 60‰, expression in X5 was higher than in Ctr and X3. In the cyan module, overall gene expression in Ctr, X3, and X5 decreased with increasing salinity. At salinities of 30‰, 45‰ and 60‰, expression in X5 was lower than in Ctr and X3. In the lightcyan and yellow modules, overall gene expression in Ctr, X3 and X5 showed no uniform trend with increasing salinity. Expression in X5 was higher than in Ctr and X3 at salinities of 45‰ and 60‰ in the lightcyan module, and at salinities of 30‰, 45‰ and 60‰ in the yellow module. This analysis shows that gene expression in the X5 strain differed from that in Ctr and X3 in these modules.
Figure 7 shows the top 20 pathways of the blue, red and tan modules after KEGG enrichment. Among the top 20 enriched pathways in the blue module, the most significant, ordered by q value from smallest to largest, were proteasome, steroid biosynthesis, ribosome, ABC transporters, etc., of which only the first was significantly enriched. In terms of gene number, biosynthesis of secondary metabolites contained the most genes. Among the top 20 enriched pathways in the red module, the most significant were carbon fixation in photosynthetic organisms, biosynthesis of secondary metabolites, biosynthesis of amino acids, etc., of which carbon fixation in photosynthetic organisms was the most significantly enriched. Among the top 20 enriched pathways in the tan module, the most significant were sulfur metabolism, RNA transport, and basal transcription factors, none of which was significantly enriched.
Figure 8 shows the top 20 pathways of the black, cyan, lightcyan, and yellow modules after KEGG enrichment. Among the top 20 enriched pathways in the black module, the most significant, ordered by q value from smallest to largest, were fatty acid biosynthesis, RNA transport, and sulfur metabolism, none of which was significantly enriched. Biosynthesis of secondary metabolites again contained the most genes. Among the top 20 enriched pathways in the cyan module, the most significant were oxidative phosphorylation, peroxisome, and cysteine and methionine metabolism, none of which was significantly enriched. Among the top 20 enriched pathways in the lightcyan module, the most significant were SNARE interactions in vesicular transport, protein processing in the endoplasmic reticulum, and arginine and proline metabolism, none of which was significantly enriched. Among the top 20 enriched pathways in the yellow module, the most significant were porphyrin and chlorophyll metabolism, aminoacyl-tRNA biosynthesis, TCA cycle, etc. Again, no pathway was significantly enriched.
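The relationship between pathway gene counts, P values and q values can be illustrated with a short sketch. The text does not state which test or adjustment produced the q values, so this example assumes a one-sided hypergeometric test followed by Benjamini-Hochberg adjustment. The background size, module size and pathway counts are hypothetical.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) for a hypergeometric draw without replacement.

    population: annotated background genes, successes: background genes in
    the pathway, draws: genes in the module, k: module genes in the pathway.
    """
    total = comb(population, draws)
    upper = min(successes, draws)
    hits = sum(comb(successes, i) * comb(population - successes, draws - i)
               for i in range(k, upper + 1))
    return hits / total


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    q = [0.0] * m
    running = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P value down
        i = order[rank - 1]
        running = min(running, pvals[i] * m / rank)
        q[i] = running
    return q


# Hypothetical counts: (module genes in pathway, background genes in pathway)
background_size, module_size = 5000, 300
pathways = {
    "pathway_A": (12, 40),
    "pathway_B": (5, 60),
    "pathway_C": (30, 450),
}
names = list(pathways)
pvals = [hypergeom_sf(pathways[n][0], background_size, pathways[n][1], module_size)
         for n in names]
qvals = bh_adjust(pvals)
for name, p, q in sorted(zip(names, pvals, qvals), key=lambda t: t[2]):
    print(f"{name}: genes={pathways[name][0]}  P={p:.3g}  q={q:.3g}")
```

The sketch also shows why the pathway with the most genes need not be the most significant: a large pathway is expected to contribute many genes to a module by chance alone.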
Expression Analysis Of Genes In The Major Enrichment Pathways In Each Sample
By analyzing the enriched pathways of genes in the modules that correlated well with salinity and strain, we identified the pathways that were most significantly enriched or that contained a higher percentage of the module genes. The expression of the genes in these pathways was then analyzed in each sample. Heatmaps of the per-sample expression of genes in the most significantly enriched pathways of the blue and red modules, proteasome and carbon fixation in photosynthetic organisms respectively, are shown in Fig. 9. In the tan module, the most significant pathway and the pathway with the largest percentage of genes contained only 2 and 4 genes, respectively, so they were not discussed. In plot a, all genes were highly expressed in samples Ctr-60, X3-60 and X5-60, especially in X5-60, where all genes were highly expressed relative to X5-30 and X5-45. This indicated that the proteasome was very active in all strains at a salinity of 60‰, and that the expression of the related genes was higher in X5-60 than in Ctr-60 and X3-60. The proteasome is mainly responsible for protein degradation, and we speculate that the cells may adapt to the high-salinity environment by remodeling their proteome through protein degradation. For carbon fixation in photosynthetic organisms (plot b), almost all genes were highly expressed in Ctr-60, X3-60, and X5-60. We speculate that carbon fixation may be enhanced in these samples in order to adapt to the high-salt environment. The expression of these genes was also higher in Ctr-60 and X3-60 than in X5-60.
Heatmaps of the per-sample expression of genes in biosynthesis of secondary metabolites in the black module, oxidative phosphorylation in the cyan module, and porphyrin and chlorophyll metabolism in the yellow module are shown in Fig. 10. In the lightcyan module, the most significant pathway and the pathway with the largest proportion of genes contained only 1 and 2 genes, respectively, and were therefore not discussed. In plot a, the expression of many genes in X5 was opposite to that in Ctr at a salinity of 30‰ (e.g. LXC000061, LXC004771, etc.). Even where genes were regulated in the same direction, the degree of regulation differed. Because this pathway is not very specific, its genes were subjected to KEGG enrichment again. The enrichment of the genes in the biosynthesis of secondary metabolites pathway is shown in Fig. 11. Apart from the first enriched pathway, the main enriched pathways were fatty acid biosynthesis, cofactor biosynthesis, glycerolipid metabolism, fatty acid metabolism, etc. The differential pathways were therefore mainly related to lipids. Plots b and c of Fig. 10 show that, at a salinity of 30‰, genes in oxidative phosphorylation and in porphyrin and chlorophyll metabolism were down-regulated in X5 relative to both Ctr and X3.
Association analysis of genes and traits
Having analyzed the enriched pathways, and the expression of the genes in those pathways, for the modules associated with the salinity and strain traits, we next analyzed which genes in these pathways were well associated with the traits. We intersected the genes in the proteasome and carbon fixation in photosynthetic organisms pathways with the genes associated with the salinity trait. In Fig. 12, the red lines represent positive correlations between these genes and salinity tolerance, and the thickness of the lines represents the magnitude of the correlation, with thicker lines indicating larger correlations (correlations between 0.70 and 0.80). The fill color of the box for each gene associated with the salinity trait represents the P value, with P < 0.05 indicating a valid correlation (the P values of all the above correlations were less than 0.05). The shape of the box represents the metabolic pathway in which the gene is located. Fig. 12 shows that five of the genes that correlated well with the salinity trait were in carbon fixation in photosynthetic organisms, and two were in the proteasome.
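The intersection of pathway genes with trait-associated genes can be sketched as a simple filter. The gene identifiers, correlation values and P values below are hypothetical placeholders. The P < 0.05 cutoff follows the text, while the minimum absolute correlation of 0.70 is an assumption based on the range of correlations reported for the figure.

```python
def trait_associated_genes(pathways, gene_trait, min_abs_r=0.70, max_p=0.05):
    """Keep, per pathway, the genes whose trait correlation passes both cutoffs.

    pathways: pathway name -> set of gene IDs in that pathway.
    gene_trait: gene ID -> (correlation with the trait, P value).
    """
    hits = {}
    for pathway, genes in pathways.items():
        hits[pathway] = sorted(
            g for g in genes
            if g in gene_trait
            and abs(gene_trait[g][0]) >= min_abs_r
            and gene_trait[g][1] < max_p
        )
    return hits


# Hypothetical inputs; these are not the study's genes or statistics.
pathways = {
    "Proteasome": {"gene_A", "gene_B", "gene_C"},
    "Carbon fixation in photosynthetic organisms": {"gene_D", "gene_E"},
}
gene_trait = {
    "gene_A": (0.78, 0.01),  # passes both cutoffs
    "gene_B": (0.40, 0.30),  # correlation too weak
    "gene_D": (0.72, 0.03),  # passes both cutoffs
    "gene_E": (0.75, 0.20),  # not significant
}
hits = trait_associated_genes(pathways, gene_trait)
for pathway, genes in hits.items():
    print(pathway, genes)
```

Using the absolute value of the correlation lets the same filter serve for negatively correlated genes.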
Fig. 13 shows the genes that correlated well with the strain trait. The blue lines represent negative correlations between these genes and the trait, and the thickness of the lines represents the magnitude of the correlation, with thicker lines indicating larger correlations (correlations between 0.70 and 0.88). The fill color of the box for each gene associated with the trait represents the P value, with P < 0.05 indicating a valid correlation (the P values of all the above correlations were less than 0.05). The shape of the box represents the metabolic pathway in which the gene is located. As seen in Fig. 13, two of the genes associated with the strain trait were in the oxidative phosphorylation pathway, one was in fatty acid metabolism, and the remaining seven were all in porphyrin and chlorophyll metabolism. From this analysis, we identified candidate genes that correlated well with salinity and strain, and the annotation of all these candidate genes is shown in Table 1.
Table 1
Annotation description of candidate genes
| GeneID | Metabolic pathways | Function |
|---|---|---|
| LXC001520 | Proteasome | Proteasome subunit beta type-1 [Gracilariopsis chorda] |
| LXC005913 | Porphyrin and chlorophyll metabolism | Delta-aminolevulinic acid dehydratase, chloroplastic [Gracilariopsis chorda] |
| LXC000548 | Oxidative phosphorylation | ATP synthase subunit d, mitochondrial [Gracilariopsis chorda] |
| LXC003646 | Carbon fixation in photosynthetic organisms | Phosphoglycerate kinase, chloroplastic [Gracilariopsis chorda] |
| LXC007344 | Proteasome | Proteasome subunit beta type-5 [Gracilariopsis chorda] |
| LXC003053 | Porphyrin and chlorophyll metabolism | Uroporphyrinogen decarboxylase [Gracilariopsis chorda] |
| LXC002968 | Porphyrin and chlorophyll metabolism | Glutamate-1-semialdehyde 2,1-aminomutase [Gracilariopsis chorda] |
| LXC000032 | Carbon fixation in photosynthetic organisms | Fructose-1,6-bisphosphatase, cytosolic [Gracilariopsis chorda] |
| LXC007989 | Oxidative phosphorylation | NADH dehydrogenase [ubiquinone] 1 beta subcomplex subunit 10-B [Gracilariopsis chorda] |
| LXC001791 | Porphyrin and chlorophyll metabolism | Magnesium-chelatase subunit ChlH, chloroplastic [Gracilariopsis chorda] |
| LXC002690 | Carbon fixation in photosynthetic organisms | hypothetical protein BWQ96_06278 [Gracilariopsis chorda] |
| LXC004308 | Carbon fixation in photosynthetic organisms | Fructose-1,6-bisphosphatase, chloroplastic [Gracilariopsis chorda] |
| LXC003712 | Carbon fixation in photosynthetic organisms | Glyceraldehyde-3-phosphate dehydrogenase, chloroplastic [Gracilariopsis chorda] |
| LXC005122 | Fatty acid biosynthesis | Enoyl-[acyl-carrier-protein] reductase [NADH] FabI [Gracilariopsis chorda] |
| LXC000696 | Porphyrin and chlorophyll metabolism | Geranylgeranyl diphosphate reductase, chloroplastic [Gracilariopsis chorda] |
| LXC004426 | Porphyrin and chlorophyll metabolism | Glutamate–tRNA ligase, chloroplastic/mitochondrial [Gracilariopsis chorda] |
| LXC005216 | Porphyrin and chlorophyll metabolism | Ferrochelatase [Gracilariopsis chorda] |

GeneID: gene number; Metabolic pathways: pathway in which the gene is located; Function: functional description.