Metagenomic Analysis of the Effects of Salinity on Microbial Diversity and Functional Gene Diversity in Kongsfjorden Estuary

Because meltwater from the Midre Lovénbreen glacier flows into Kongsfjorden, salinity increases from the estuary to the interior of the fjord. Using metagenomic analysis, we aimed to determine which bacterial taxa and metabolism-related genes change in relative abundance with salinity, and whether salinity is correlated with genes related to nitrogen and sulfur cycling in the fjord ecosystem. Our data indicate that changes in salinity may affect some bacterial taxa: the relative abundance of Alphaproteobacteria and Deltaproteobacteria is higher at the low salinity sites, while Gammaproteobacteria and Betaproteobacteria are more abundant at the high salinity sites. The relative abundance of some bacteria also differed between the high and low salinity sites at the family level: Flavobacteriaceae and Vibrionaceae at the low salinity site, and Colwelliaceae, Chromatiaceae and Alteromonadaceae at the high salinity site, are affected by salinity. In terms of functional gene diversity, our results suggest that salinity can affect the relative abundance of metabolism-related genes by altering microbial metabolism. Besides salinity, the functional attributes of the microorganisms themselves were also important factors affecting the relative abundance of metabolism-related genes. Salinity also has some effect on the relative abundance of genes related to nitrogen and sulfur cycling.


Introduction
Over the past few decades, the global climate has changed dramatically, with the effects most pronounced in the polar regions (Wang et al. 2019). Warming raises temperatures in the Arctic Ocean region, including Spitsbergen, causing glaciers to melt earlier in the annual cycle and then refreeze, increasing precipitation and reducing sea ice cover. In addition, glacial meltwater drains downstream, changing the biogeochemistry and nutrient supply of downstream ecosystems (Bhatia et al. 2013; Hawkings et al. 2015; Hood et al. 2015; Hood and Berner 2009; Lawson et al. 2014; Schroth et al. 2014).
Because of the strong changes in the salinity and chemical properties of sea water, the structure and abundance of the microbial community in downstream fjords and estuaries will change. Microorganisms in marine ecosystems are the most important drivers of biogeochemical cycling on a global scale (Azam et al. 1983), as they are responsible for the remineralization of organic matter and the transfer of nutrients and energy to higher trophic levels in the ocean (Mason et al. 2009; Nealson 1997). However, their taxonomic composition and metabolic function are influenced by "bottom-up" physical and chemical factors, such as salinity, nutrient concentration and nutrient availability (Lozupone and Knight 2007; Yokokawa et al. 2004).
Therefore, natural processes that change physicochemical dynamics strongly affect the bacterial community, causing rapid changes in community structure and function, and thus have an impact on the aquatic environment.
Kongsfjorden, located on the west coast of Svalbard at 79°N, is an open glacial fjord in the Arctic.
Arctic glacial fjords are characterized by the discharge of fresh water and suspended matter from glaciers at the head of the fjord, so the stable ocean water at the fjord entrance gives way to highly unstable brackish water in the inner basin (Syvitski et al. 1987). As a passage connecting the Atlantic Ocean and the Arctic Ocean, Kongsfjorden has attracted extensive research attention in recent years. Previous studies have shown that changes in salinity and sediment load caused by the inflow of glacial meltwater are the main determinants of changes in microbial community composition and diversity in Kongsfjorden (Piquet et al. 2010), consistent with other research indicating that salinity is a selective pressure governing global bacterial distribution (Lozupone and Knight 2007). However, the characteristics and metabolic potential of the fjord bacterial community have not been determined.
Since bacteria are the basis of major biogeochemical cycles, changes in their community characteristics and functions will propagate to higher trophic levels and to the whole ecosystem. It is therefore important to understand how salinity changes affect bacterial communities.
Metagenomics overcomes the constraints of culture-based microbiology and can be used as a search tool for detailed screening of the microbial species present in ecosystems (Martinez-Porchas et al. 2017). Because DNA extracted from environmental samples is a microcosm of the entire microbial community of an ecosystem, metagenomic analysis can provide a more comprehensive assessment of that community (Lagkouvardos et al. 2016). Advances in community analysis using metagenomic methods have thus made it possible to characterize both bacterial taxonomic composition and functional metabolic potential. The aim of this study was to investigate whether salinity is an important factor affecting the bacterial community structure and the relative abundance of metabolic genes in the fjord. To do this, we set up three sampling sites in Kongsfjorden: the intersection of glacial meltwater and the edge of the fjord (S5), the coastal waters (S6) and the interior of Kongsfjorden (S7). Because the salinity of the three sites varies significantly owing to the input of glacial meltwater and their location in the fjord, we hypothesized that salinity is an important factor affecting bacterial community structure and major metabolic genes in the fjord ecosystem. We therefore focused on the effects of salinity changes on bacterial community structure and the relative abundance of functional genes.

Study sites and sample collection
Kongsfjorden is a polar fjord on the west coast of Svalbard, located between 78°04′N-79°05′N and 11°03′E-13°03′E. The fjord has a small tidal range (~2 m) and is strongly influenced by topography and the adjacent ocean. The western coastal waters of Svalbard are affected by the northernmost extension of the North Atlantic Current. The Midre Lovénbreen glacier is located in the Kongsfjorden region. Glacial meltwater flows into Kongsfjorden as glacial runoff and is the main source of fresh water. Consequently, salinity increases from the point where glacial meltwater enters the estuary to the fjord interior (Piquet et al. 2010).
Three sampling points were set up in the waters of Kongsfjorden: the estuary of Kongsfjorden (S5, 12°01´50.9"E and 78°54´54.2"N), the coastal sea area (S6, 12°02´6.9"E and 78°54´55.7"N), and the interior of the fjord (S7, 12°02´32.8"E and 78°55´2.7"N) (Fig. 1). The sampling points were 500 meters apart. Water samples were collected directly into TWIRL'EM sterile sampling bags (Labplas Inc., Canada). Microbial samples were then collected by filtering 1000 ml of each water sample, trapping the microbial biomass on 47-mm-diameter, 0.2-μm-pore-size membrane filters (Pall Corporation, USA). The membrane filters were placed in centrifuge tubes, stored at −20°C at the Yellow River Station (China) and taken to the laboratory by plane. Filters were then frozen at −80°C until nucleic acid extraction.
In the metagenome analysis, reads of low quality (containing 40 or more bases with scores <20) or with more than 10 unknown bases (Ns) were removed using SOAPaligner (Santamans et al. 2017).
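The read-filtering rule stated above can be sketched in a few lines of Python. This is only an illustration of the two thresholds given in the text, not the SOAPaligner implementation; the function names and the assumption that qualities are Phred+33 encoded are ours.

```python
def decode_phred33(quality_string):
    """Convert a FASTQ quality string to integer scores (Phred+33 is assumed)."""
    return [ord(ch) - 33 for ch in quality_string]


def keep_read(sequence, quality_string, min_score=20,
              max_low_quality_bases=39, max_unknown_bases=10):
    """Return True if a read passes both filters described in the text.

    A read is removed when it has 40 or more bases with score < 20,
    or more than 10 unknown bases (N).
    """
    scores = decode_phred33(quality_string)
    low_quality = sum(1 for s in scores if s < min_score)
    unknown = sequence.upper().count("N")
    return low_quality <= max_low_quality_bases and unknown <= max_unknown_bases


# Hypothetical 100-bp read: 39 low-quality bases is still acceptable.
example_ok = keep_read("A" * 100, "#" * 39 + "I" * 61)
```

In a real pipeline the same predicate would be applied to every record of a FASTQ file, and to both mates of a read pair if paired-end filtering is wanted.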
Clean reads from each sample were assembled with SOAPdenovo2 using default parameters (Gu et al. 2013). The reads not assembled into scaftigs (continuous sequences within scaffolds) in each sample were pooled to form a mixed sample, which was reassembled. All scaftigs shorter than 500 bp were discarded.
Open reading frames in the scaftigs of each sample and of the mixed sample were predicted using MetaGeneMark (Luo et al. 2012). Redundant genes in each sample were removed using CD-HIT with 95% identity and 90% coverage (Zhu et al. 2010). The resulting genes were further filtered by the number of reads assembled, and those assembled from fewer than two reads were discarded to generate a gene catalog (unigenes). Unigenes were then aligned using DIAMOND (W. Li et al. 2012) against the reference genes in the NCBI-nr database at a cut-off e-value of 1E-10. In the functional analysis, DIAMOND was used to map unigenes against reference genes in KEGG (Buchfink et al. 2015).
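The two size and support thresholds used when building the gene catalog (scaftigs of at least 500 bp, genes supported by at least two reads) can be illustrated as follows. This is a minimal sketch with hypothetical function names and in-memory dictionaries; the actual filtering was done by the tools named above.

```python
def filter_scaftigs(scaftigs, min_length=500):
    """Keep scaftigs that are at least min_length bp long.

    scaftigs maps a scaftig name to its nucleotide sequence.
    """
    return {name: seq for name, seq in scaftigs.items() if len(seq) >= min_length}


def build_gene_catalog(read_counts, min_reads=2):
    """Keep genes supported by at least min_reads reads.

    read_counts maps a gene identifier to the number of reads assigned to it.
    """
    return {gene: n for gene, n in read_counts.items() if n >= min_reads}


# Hypothetical input: one scaftig just below and one exactly at the cut-off.
kept = filter_scaftigs({"scaftig_1": "A" * 499, "scaftig_2": "A" * 500})
catalog = build_gene_catalog({"gene_1": 1, "gene_2": 2, "gene_3": 10})
```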

Results and Discussion
Chemical properties analysis of seawater
Because one replicate sample (S7.2) from sampling site S7 was damaged during transportation, only two replicates of S7 were available for the subsequent experiments. However, the chemical indicators of the two remaining S7 replicates agree closely, so the loss does not affect the subsequent results.
Salinity differs significantly among the three sampling sites (one-way ANOVA, P < 0.05; Table S1) and increases from S5 to S7 (Table 1). In addition to salinity, six other chemical factors that are important parameters in marine research were determined at the three sampling sites: pH, NH₄⁺-N, NO₃⁻-N, NO₂⁻-N, PO₄³⁻-P and SiO₄²⁻-Si. The maximum values of SiO₄²⁻-Si (92.48 μg/g) and NO₃⁻-N (5.26 μg/g) were detected in the S7 samples, and the maximum values of NH₄⁺-N (33.29 μg/g) and NO₂⁻-N (3.24 μg/g) were detected at S6. Overall, salinity is the chemical property that differs most among the three sampling sites, so we infer that salinity may influence bacterial community structure and gene diversity. Because our research focuses on the effects of salinity on bacterial community diversity and the abundance of functional genes, we do not discuss in detail the effects of other physical and chemical properties on bacterial diversity and gene diversity.
Statistical significance was assessed by one-way ANOVA followed by Tukey's HSD test, and differences between two groups were considered significant when p < 0.05. The letters a, b, and c were used to show statistically significant differences.
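As an illustration of the one-way ANOVA used for the site comparisons, the sketch below computes the F statistic and its degrees of freedom from replicate measurements. The replicate values are hypothetical (only the approximate site salinities appear in the text), and the function handles unequal group sizes, which matters here because S7 has two replicates. Obtaining the p-value requires the F distribution; a statistics library such as SciPy (`scipy.stats.f_oneway`, and `scipy.stats.tukey_hsd` for the post hoc test) could be used for that step.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of measurements."""
    all_values = [v for group in groups for v in group]
    n_total = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n_total

    # Between-group sum of squares: group size times squared deviation of the group mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Within-group sum of squares: squared deviation of each value from its group mean.
    ss_within = sum(
        (v - sum(g) / len(g)) ** 2 for g in groups for v in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical salinity replicates (ppt) for S5, S6 and S7; S7 has two replicates.
salinity = [[5.0, 5.1, 5.2], [15.1, 15.2, 15.2], [30.0, 30.2]]
f_stat, df_between, df_within = one_way_anova(salinity)
```

The function raises a ZeroDivisionError if all replicates within every group are identical, since the within-group variance is then zero.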

Metagenomic sequencing results and gene prediction
The raw data were obtained on the Illumina HiSeq sequencing platform and were then preprocessed and summarized. We counted the genes at each sampling site and drew a Venn diagram (Fig. 2); 1,193,970 genes were shared by the three sampling sites. The number of unique genes for S5, S6 and S7 was 293,182, 79,733 and 43,216, respectively.
In the metagenomic libraries of the three sampling sites, 11.7%, 17% and 13% of the genes were unannotated, respectively.
Effect of salinity on bacterial community
To explore the effect of salinity on bacterial community structure, we analyzed the correlation between salinity and the bacterial taxa whose relative abundance differed greatly between sampling sites, and screened out the taxa with a high correlation coefficient with salinity (Table 3). Proteobacteria are abundant at all three sampling sites, at 58.51%-62.98% (Fig. 3a).
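The shared and site-specific gene counts behind a three-set Venn diagram can be obtained with plain set operations. The sketch below assumes each site is represented by a set of gene identifiers; the function name and the toy identifiers are hypothetical.

```python
def venn_counts(gene_sets):
    """Return (number of genes shared by all sites, {site: number of unique genes}).

    gene_sets maps a site name to a set of gene identifiers found at that site.
    """
    names = list(gene_sets)
    shared = set.intersection(*gene_sets.values())
    unique = {}
    for name in names:
        # Genes found at any of the other sites.
        others = set().union(*(gene_sets[o] for o in names if o != name))
        unique[name] = len(gene_sets[name] - others)
    return len(shared), unique


# Toy example with hypothetical gene identifiers.
sites = {
    "S5": {"g1", "g2", "g3", "g4"},
    "S6": {"g1", "g2", "g5"},
    "S7": {"g1", "g2", "g6"},
}
n_shared, n_unique = venn_counts(sites)
```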
Alphaproteobacteria (25.79%) and Deltaproteobacteria (0.35%) are relatively abundant at the low salinity sites, while the abundance of Gammaproteobacteria (39.67%) and Betaproteobacteria (1.5%) is higher at the high salinity sites (Fig. 3b; one-way ANOVA, P < 0.05; Table S2). At the family level, the abundance of Rhodobacteraceae (12.23%), Pseudoalteromonadaceae (8.45%), Flavobacteriaceae (10.44%) and Vibrionaceae (0.69%) is relatively high at the low salinity sites (Fig. 3c; one-way ANOVA, P < 0.05; Table S3). The abundance of Actinobacteria (0.38%-1.2%) increases gradually along the salinity gradient. In particular, the abundance of Acidimicrobiaceae (0.06%-0.26%) and Microbacteriaceae (0.08%-0.68%), which belong to Actinobacteria, is higher in S6 and S7 than in S5. At the family level, the abundance of Colwelliaceae (8.33%), Chromatiaceae (0.46%) and Alteromonadaceae (11.42%) is higher at the high salinity sites than at the low salinity sites (one-way ANOVA, P < 0.05; Supplementary material).
Salinity can be an important factor driving microbial diversity (Bouvier and del Giorgio 2002; Lozupone and Knight 2007; Tamames et al. 2010; Wu et al. 2006) and controls global microbial distribution (Lozupone and Knight 2007). Our data confirm that certain bacterial groups are strongly correlated with salinity. The number of genes at the low salinity sites was higher than at the high salinity sites, and Banda et al. (2020) pointed out that species diversity decreases in high salinity environments. We first screened out bacterial taxa that were strongly correlated with salinity and then discussed the differences in their relative abundance. As shown in Fig. 3b, Alphaproteobacteria and Deltaproteobacteria are highly abundant in S5. These classes are ubiquitous in the marine environment and contain many highly abundant marine species (Biers et al. 2009; Capo et al. 2020).
The high abundance of Rhodobacteraceae and Pseudoalteromonadaceae observed in S5 is likely due to the rapid response of these families to the input of external nutrients carried by glacial meltwater (Fig. 3c); their high abundance may indicate disturbance of coastal surface water (Allers et al. 2007; Nogales et al. 2011). High abundances of Flavobacteriaceae and Vibrionaceae are also detected in S5. Flavobacteriaceae prefer to utilize complex organic matter by attaching directly to algal cells and algae-derived detrital particles (Y. Li et al. 2018), and the fresher water of S5 is more conducive to the growth of algae (Buchholz and Wiencke 2016), so salinity affects the abundance of Flavobacteriaceae indirectly. The high abundance of Vibrionaceae can be ascribed to its association with soil microbiota (Reen et al. 2006), suggesting that Vibrionaceae may be terrigenous microorganisms that are carried into the fjord by glacial meltwater and then subjected to salinity stress, which would explain why they are less abundant in the high salinity areas.
Actinobacteria are common degraders in soil and the ocean (Bull et al. 2005; Magarvey et al. 2004) and were initially thought to be transferred to the marine environment through terrestrial runoff (Bull et al. 2005). However, in our data the abundance of Actinobacteria in S6 and S7 is higher than in S5 (Fig. 3a), and the abundance of Acidimicrobiaceae and Microbacteriaceae in S6 and S7 supports the view that Actinobacteria are resident members of marine environments (Cottrell et al. 2005; Han et al. 2003; Rusch et al. 2007) and high-salt environments (Ghai et al. 2011) (Fig. 3c). The abundance of Gammaproteobacteria is higher in the high salinity site S7 (Fig. 3b), which is consistent with previous studies (Paver et al. 2018). Shifts in the phylogenetic composition of Proteobacteria along salinity gradients show the effect of salinity on community structure (Wu et al. 2006), suggesting that salinity is an important factor controlling the composition of the microbial community in Kongsfjorden. The high relative abundance of Colwelliaceae, Chromatiaceae and Alteromonadaceae observed in the high salinity site S7 may be attributed to their sensitivity to salinity (Fig. 3c), suggesting that they are marine bacteria (Kwak et al. 2012; Methé et al. 2005; Pfennig and Trüper 1981) and may have "salt exclusion" mechanisms such as osmotic balancing and protection against desiccation (Banda et al. 2020). In this adaptation, the microorganism synthesizes compatible solutes that help stabilize the cell membrane structure.
Plotting the 35 most abundant functional genes (Fig. 5) shows large differences in functional gene abundance among the three sampling points, especially between S5 and S7.
We focused on the abundance of metabolism-related genes and found that most of them are abundant in S5, especially those for Energy metabolism (3.09%), Amino acid metabolism (5.05%) and Carbohydrate metabolism (4.31%), while genes for Nucleotide metabolism (2.39%) and Lipid metabolism (1.33%) are abundant in S7.
The high abundance of most metabolism-related genes in S5 suggests that the Kongsfjorden estuary is a highly competitive environment in which bacteria must be able to take advantage of changing nutrient conditions through multiple metabolic pathways.
Most of the S5 metabolism-related genes are concentrated in energy metabolism, amino acid metabolism and carbohydrate metabolism (Fig. 5). The influx of glacial meltwater in the upper reaches of Kongsfjorden promotes the accumulation of carbon-rich substances, and the conversion of inorganic carbon to organic matter through carbon sequestration is the main functional attribute of the ecosystem.
Chemoautotrophs also use inorganic carbon to some extent (Feisthauer et al. 2008). Although heterotrophic prokaryotes utilize organic carbon, they can also incorporate dissolved inorganic carbon through extensive carboxylation reactions (Reddy et al. 2019). The high abundance of energy metabolism-related and carbohydrate metabolism-related genes at S5 reflects the rich microbial community at the low salinity site, which forms an effective carbon sequestration system in the fjord estuary through autotrophic and heterotrophic mechanisms. Proteobacteria and actinomycetes have previously been reported to play a major role in carbohydrate metabolism and carbon fixation (Reddy et al. 2019).
Enhanced amino acid metabolism can enable microorganisms to obtain nutrients successfully under intense nutrient competition. For example, increased nutrient acquisition by highly active microbial groups in S5 leads to the production of reactive oxygen species through nutrient oxidation, metabolism and cellular respiration (Cabiscol Català et al. 2000) and induces cellular responses to oxidative stress, of which glutathione is a major component (Klatt and Lamas 2000).
In the high salinity environment of S7, halophilic and salt-tolerant microorganisms must keep their cytoplasm at least isosmotic with their environment in order to withstand the high salinity (Oren 2002). The strategy of some of these organisms is to remove as much salt from the cytoplasm as possible and to accumulate organic solutes that provide osmotic equilibrium. A variety of compounds can be used for this purpose, from glycerol and other sugar alcohols to cytidine (Galinski 1995), which may lead to the high abundance of genes related to lipid metabolism and nucleotide metabolism in high-salt environments.
Denitrification-related genes (nirK, nirS, norC, norB and nosZ) and anammox-related genes (hdh) are rare in S7. In addition, the abundance of most nitrogen cycle-related genes is higher in S5 than at the other two sampling sites. We analyzed the correlation between the abundance of nitrogen cycling-related genes and the chemical properties at each sampling site (Table 4). Some genes for dissimilatory nitrate reductase and assimilatory nitrate reductase were not highly correlated with salinity, while most of the nitrogen cycle-related genes were highly correlated with salinity.
To identify the microbial groups to which the nitrogen metabolism-related genes belong, we plotted the relative abundance of the groups carrying each gene (Fig. 6b). The results show that different nitrogen cycling-related genes belong to different microbial groups. For example, the taxa carrying the denitrification-related gene nosZ in S5 are mainly Gammaproteobacteria, Alphaproteobacteria and Flavobacteria, while those in S6 and S7 are mainly Gammaproteobacteria. In general, the taxa carrying nitrogen cycling-related genes at the low salinity sites are mainly Gammaproteobacteria and Alphaproteobacteria, while those at the high salinity sites are mainly Gammaproteobacteria.
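A correlation between salinity and gene abundance of the kind reported in Table 4 could be computed as sketched below. The text does not state which correlation coefficient was used, so Pearson's r is an assumption here, and the gene abundance values are hypothetical; the site salinities are the approximate values quoted in the text (taking 15.16 ppt to be S6).

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Site salinities (ppt) for S5, S6, S7 and a hypothetical relative abundance of one gene.
salinity_ppt = [5.10, 15.16, 30.11]
gene_abundance = [0.012, 0.008, 0.003]
r = pearson_r(salinity_ppt, gene_abundance)
```

With only three sites, a coefficient close to -1 or 1 is easy to obtain by chance, so such values are better read as descriptive than as strong evidence of dependence.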
We examined the abundance of nitrogen cycling-related genes and the structure of the microbial community carrying them at different salinities. We believe that salinity affects the abundance of genes associated with the nitrogen cycle. When salinity increases from 5.10 to 30.11 ppt, the abundance of most nitrogen cycle-related genes decreases (Fig. 6a). The abundance of nitrification-related genes (amoC, hao and nxrB) and denitrification-related genes (nirK, nirS, norB, norC and nosZ) is high at the low salinity sites. Previous studies have found that estuarine nitrifiers grow best at 5-10 ppt and are inhibited if the salinity exceeds 10 ppt (Zhou et al. 2017), which is consistent with nitrification-related genes being most abundant where salinity is below 10 ppt. Denitrification, which returns nitrogen to the atmosphere as N2O and N2, shows a different relationship with salinity. In some estuaries, the abundance of denitrification-related genes is negatively correlated with salinity in the range of 0-36 ppt (Giblin et al. 2010). The abundance and potential of denitrifying bacteria are associated with low salinity of around 5 ppt (Franklin et al. 2017; Marton et al. 2012). Salinity can affect denitrification by altering the organic substrates required by heterotrophic bacteria (Franklin et al. 2017). Therefore, the high concentration of NO₃⁻-N at high salinity can be attributed to the low abundance of denitrifying bacteria there. In addition, dissimilatory nitrate reduction to ammonium (DNRA; nirB, nirD and nrfA) can compete with denitrification for nitrate. We observed higher concentrations of NH₄⁺ at the lower salinity sites (Table 1), which may reflect increased DNRA activity offsetting NH₄⁺ consumption (Marchant et al. 2014).
An earlier study also showed that, although anammox may be inhibited at higher salinity, increases in salinity below 15 ppt can stimulate anammox (Jin et al. 2012). This may explain our observation that the abundance of the anammox-associated gene (hdh) is also high when salinity ranges from 5.1 to 15.16 ppt (Fig. 6a).
We suggest that salinity affects the abundance of genes associated with the nitrogen cycle by affecting gene-carrying microbial populations. Sahan and Muyzer (2008) showed that salinity was the main factor controlling the distribution of microorganisms associated with the nitrogen cycle.
Gammaproteobacteria and Alphaproteobacteria dominate the taxa carrying most of the nitrogen cycling-related genes at the three sampling sites, but Alphaproteobacteria are slightly limited under high salinity (Fig. 6b), which is consistent with the low abundance of Alphaproteobacteria in the high salinity areas (Fig. 3b).
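To compare sites at the level of nitrogen-cycle processes rather than single genes, per-gene abundances can be summed by pathway. The gene-to-pathway grouping below follows the genes named in this section; the function name and the abundance values are hypothetical, and summing is only one possible way to aggregate.

```python
# Gene groups as named in the text for each nitrogen-cycle process.
NITROGEN_PATHWAYS = {
    "nitrification": ["amoC", "hao", "nxrB"],
    "denitrification": ["nirK", "nirS", "norB", "norC", "nosZ"],
    "DNRA": ["nirB", "nirD", "nrfA"],
    "anammox": ["hdh"],
}


def pathway_abundance(gene_abundance, pathways=NITROGEN_PATHWAYS):
    """Sum relative gene abundances per pathway; missing genes count as 0."""
    return {
        pathway: sum(gene_abundance.get(gene, 0.0) for gene in genes)
        for pathway, genes in pathways.items()
    }


# Hypothetical relative abundances for one sampling site.
site_s5 = {"nirK": 1.0, "nosZ": 2.0, "hdh": 0.5, "amoC": 0.25}
totals_s5 = pathway_abundance(site_s5)
```

Running the same function on each site gives a small site-by-pathway table that can be set against the salinity gradient.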

Effect of salinity on sulfur cycle
Among our KEGG-annotated genes, the sulfur cycle is mainly controlled by two metabolic processes, sulfur oxidation and sulfur reduction (Cao et al. 2014) (Fig. 7a). Our results indicate that sulfur cycling-related genes account for the largest proportion in S5. The sulfur oxidation genes (SoxAX, SoxY, SoxZ and SoxB) are carried by similar groups of microorganisms (Fig. 7b). About half of the sulfur cycling-related genes we examined are highly correlated with salinity, and all of these are negatively correlated with salinity (Table 5).
Fjords are an important part of the global ecosystem and play an important role in the sulfur cycle.
Sulfur oxidation (Sox) is usually driven by sulfur-oxidizing bacteria, which oxidize reduced sulfur compounds, including elemental sulfur, sulfides and thiosulfates (Yang et al. 2013). The Sox enzyme system is widely present in known sulfur-oxidizing bacteria. In this study, four genes of the Sox enzyme system were detected: SoxAX, SoxY, SoxZ and SoxB (Fig. 7a). Sox-related genes are mainly carried by Gammaproteobacteria, Betaproteobacteria and Alphaproteobacteria (Fig. 7b). Yang et al. (2013) found that a sulfur-oxidizing bacterial population responds to increasing salinity through successive changes in community structure rather than through gradual adaptation of the population. Because sulfur-oxidizing bacteria belong to various classes of Proteobacteria, their response to increasing salinity follows that of the proteobacterial classes (Yang et al. 2013). Our study shows that Alphaproteobacteria are dominant at low salinity, while Gammaproteobacteria are dominant at high salinity (Fig. 3b). Consistently, alphaproteobacterial sulfur-oxidizing bacteria are highly abundant at the low salinity sites, while gammaproteobacterial sulfur-oxidizing bacteria are highly abundant at the high salinity sites (Fig. 7b). Bacterial sulfate reduction has important ecological and geochemical significance in marine high-salt sediments (Oren 1988). Notably, although Gammaproteobacteria and Alphaproteobacteria also carry sulfate reduction-related genes, Flavobacteriia are dominant among the microbial populations carrying the sulfate reduction-related genes cysD and cysN (Fig. 7b). Flavobacteriia therefore appear to play an important role in sulfate reduction, which is consistent with the results of Y. Li et al. (2018).

Conclusion
Using metagenomic analysis, we showed that changes in salinity can affect the relative abundance of some bacterial taxa and can also change the relative abundance of functional genes, including genes related to the nitrogen and sulfur cycles.