Metagenomic Shotgun Analyses Reveal Complex Patterns of Intra- and Interspecific Variation in the Intestinal Microbiomes of Codfishes

The composition of the intestinal microbial community associated with teleost fish is influenced by a diversity of factors, ranging from internal factors (such as host-specific selection) to external factors (such as niche occupation). These factors are often difficult to separate, as differences in niche occupation (e.g., diet, temperature, or salinity) may correlate with distinct evolutionary trajectories. Here, we investigate four gadoid species with contrasting levels of evolutionary separation and niche occupation. Using metagenomic shotgun sequencing, we observed distinct microbiomes among two Atlantic cod (Gadus morhua) ecotypes (NEAC and NCC) with distinct behavior and habitats. In contrast, interspecific patterns of variation were more variable. For instance, we did not observe interspecific differentiation between the microbiomes of coastal cod (NCC) and Norway pout (Trisopterus esmarkii), whose lineages underwent evolutionary separation over 20 million years ago. The observed pattern of microbiome variation in these gadoid species is therefore most parsimoniously explained by differences in niche occupation.

surrounding water influences the intestinal microbiome in fish larvae and fry (3,4); water temperature is the main driver for the gut microbiome composition in farmed Tasmanian Atlantic salmon (Salmo salar) (5); and diet influences the intestinal composition in both experimental (6)(7)(8)(9) as well as wild fish populations (10)(11)(12)(13). Yet internal factors also influence the composition of these bacterial communities. For instance, observations of a shared (core) microbiome between wild and laboratory-raised zebrafish suggest that distinct selective pressures determine the composition of the microbial communities (14). Moreover, an association between host phylogeny and intestinal microbiome composition has been observed for a range of fishes, marine animals, and terrestrial mammals (15)(16)(17)(18)(19). The adaptive immune system appears especially important for host selection. Individual variation of the major histocompatibility complex (MHC) II correlates with the gut microbiome composition in stickleback (20); mucosal IgT depletion causes dysbiosis in rainbow trout (Oncorhynchus mykiss) (21); and lack of a functional adaptive immune system reduces the strength of host selection in knockout zebrafish models (22). Among bony fish, gadoid fishes have an unusual adaptive immune system in which there is loss of MHC II, CD4, and invariant chain (Ii) and in which there is a range of innate (TLR) and MHC I immune-gene expansions (23,24). Moreover, Atlantic cod has high levels of IgM (25) and a minimal antibody response after pathogen exposure (25)(26)(27). Gadoids therefore provide an interesting ecological system to study hostmicrobiome interactions (28).
Studies that specifically integrate internal and external influences support a role for both factors driving the microbial community composition (13,29). Such studies, however, remain restricted in both the level of taxonomy of fishes (30) as well as taxonomical resolution of the microbial analyses (16S rRNA) (13,29,(31)(32)(33). Importantly, it often remains difficult to separate the correlated effects of distinct behavior (e.g., diet) and niche occupation with interspecific selection. Also, no comparative studies have used metagenomic shotgun sequencing to investigate fish populations with profound differences in behavior within a single species. It therefore remains unclear whether the microbial composition for a range of wild fish species is characterized by intra-or interspecific divergence.
Here, we study intra-and interspecific divergence of intestinal microbial communities within the widespread family of Gadidae using a metagenomic shotgun data set. We compare the microbiomes from Norway pout (Trisopterus esmarkii), poor cod (Trisopterus minutus), northern silvery pout (Gadiculus thori), and two ecotypes of Atlantic cod (Gadus morhua). These four species have overlapping geographical distributions, are dietary generalists in typically feeding over sandy and muddy bottoms on pelagic or benthic crustaceans, polychaetas, and (small) fish (34,35), and evolutionarily diverged approximately 20 million years ago (24). Norway pout is benthopelagic, distributed from the English Channel, around Iceland, and up to the Southwest Barents Sea. It is mostly found at depths between 100 and 200 m. Poor cod is also benthopelagic, distributed from the Trondheim Fjord in Norway to the Mediterranean Sea, and mostly found between 15 and 200 m. Northern silvery pout (Gadiculus thori) is meso-to bathypelagic (36), distributed in the North Atlantic Ocean, along the coast of Norway, and around Iceland and Greenland. It forms large schools that are usually found between 200 and 400 m (34,36,37). Finally, Atlantic cod has a trans-Atlantic distribution, from the Bay of Biscay to the Barents Sea, the Baltic Sea, around Iceland and Greenland, in the Hudson Bay, and along the North American coast (34,(38)(39)(40)(41)(42). Atlantic cod comprises various subpopulations and "ecotypes" with distinct adaptations, migratory behavior, and feeding behavior. For instance, northeast Arctic cod (NEAC) performs typical spawning migrations from the Barents Sea to the Norwegian coast, whereas the Norwegian coastal cod (NCC) remains more stationary (34,43). These ecotypes have increased genomic divergence in several large chromosomal inversions (43)(44)(45)(46)(47), suggestive of local adaptation. The environments that these two ecotypes encounter are different, and they feed on distinct types of food. NEAC consumes mostly capelin and herring while NCC feeds on a wide range of crustaceans, fish, and seaweed (34,39,48). During spawning, these ecotypes spatially co-occur, and long-term gene flow between ecotypes is supported by low overall estimates of divergence in most genomic regions, apart from the chromosomal rearrangements (43).
We hypothesize that if interspecific selection (indicative of host selection) is the main driver for the intestinal communities in the Gadidae, most differences will be found between the different species, and not between the different ecotypes within Atlantic cod. In contrast, if environmental factors are the main drivers for the intestinal communities, we expect significant compositional differences between the ecotypes of Atlantic cod, as well as various levels of differentiation between the species. We used taxonomic profiling of metagenomic shotgun reads to classify these microbiomes (obtained from various locations around the Norwegian coast [ Table 1]) at order-and species-level resolution and analyzed within-species differentiation of the most abundant members by genome-wide single nucleotide variation. Finally, differences in gut bacterial community composition among the species and ecotypes were assessed using multivariate statistics.

RESULTS
Taxonomical composition of the intestinal microbiomes. We analyzed a data set of 422 million paired-end reads, with a median sample size of 11.9 million reads (8.0 to 19.6 million reads per sample) ( Table 2, Table S7 in the supplemental material). Following filtering, order-level classification could be obtained for 93% of all sequences ( Table 2). Based on nonnormalized order-level sequence counts, we observed clear patterns of separation between species and ecotypes in a multivariate nonmetric multidimensional scaling (NMDS) plot (Fig. 1b), with NEAC and northern silvery pout forming distinct clusters, whereas the NCC populations encompass the Norway pout and poor cod populations. Vibrionales was the most abundant order detected in the intestinal microbiomes of NCC specimens at both coastal locations (mean relative abundance [MRA]: 76%) as well as Norway pout (MRA: 79%) and poor cod (MRA: 44%) ( Table 3, Fig. 2a), with the remainder of each gut community consisting of a mix of orders with low relative abundance. The intestinal microbiomes of the NEAC and northern silvery pout specimens had a significantly more diverse community composition (Fig. 3, Fig. 2a). NEAC was dominated by Bacteroidales (MRA: 21%), Vibrionales (MRA: 17%), Clostridiales (MRA: 12%), and Brevinematales (MRA: 7%), while northern silvery pout had a high relative abundance of orders Brachyspirales (MRA: 16%) and Clostridiales (MRA: 14%). Distinct from the gut communities of the other fish populations, northern silvery pout had a low abundance of Vibrionales. Finally, the amount of sequences in the "Others" category, as well as sequences classified above order level (mean of all samples was 7.8%), varied slightly between the fish species (Table S8). A species-level classification was obtained for 66% of all sequences. Overall, species of the genus Photobacterium comprised on average 40.6% of the classified sequences, ranging from 0.2% in northern silvery pout to 74.3% in Norway pout (Fig. 2b). In particular, P. kishitanii and P. iliopiscarium represented on average 43% and 36% of all Photobacterium species, respectively, although the ratio differed in the different fish species (e.g., 49% versus 41% in NCC; 16% versus 56% in NEAC; and 55% versus 12% in Norway pout). The NCC Lofoten intestinal microbiome was dominated by P. iliopiscarium (MRA: 21%) and P. kishitanii (MRA: 20%), followed by different species of Aliivibrio (A. wodanis, A. logei, and A. fischeri) (MRA: 13%) (Fig. 2b). Similarly, the bacterial gut community of Norway pout was also dominated by Photobacterium species, in particular P. kishitanii (MRA: 17%). The intestinal microbiome of poor cod was dominated by Photobacterium species (MRA: 18%), followed by different Vibrio spp. (MRA: 8%). The gut bacterial community of NEAC was more diverse, with high relative abundance of a Brevinema sp. (MRA: 31%) and different species in the genera Photobacterium (MRA: 34%), Clostridium (MRA: 12%), and Aliivibrio (MRA: 9%). The high abundance of Bacteroidales observed at the order level ( Fig. 2a) was not reflected at the species level, as this order represented a high number of Bacteroidales species with low abundance. Consequently, no Bacteroidales species were among the 15 most abundant species in the NEAC intestinal microbiome (Fig. 2b). The NEAC samples also contained a Mucispirillum sp. (MRA: 4%) and two Brachyspira spp. (MRA: 2%). In northern silvery pout, the gut microbiome was quite evenly distributed between the Brevinema sp., the Mucispirillum sp., Brachyspira pilosicoli, Brachyspira sp. CAG:700, and a group of different Clostridium species in two of three samples. The third sample contained the same species but had an even higher relative abundance of the Brevinema sp. (64%) (Fig. 2b).
Variation in bacterial community composition among species and ecotypes. Significant differences in within-sample diversity (alpha diversity) at the order level were observed among all species and within-species ecotypes, except between NCC and Norway pout ( Table 4, Table S5). None of the other covariates had a significant effect on alpha diversity. Similar to the results from the within-sample diversity, significant differences in community structure (beta diversity) were observed among the gadoid species at order, genus, and species levels (  in all the other gadoids (at a significance level of 0.05). The NCC intestinal microbiome was also different from that of both poor cod and northern silvery pout. In agreement with results of within-sample (alpha) diversity, no differences in community structure were observed between the microbiomes of NCC and Norway pout. Finally, no differences were observed between the gut microbiome of poor cod versus Norway pout, poor cod versus northern silvery pout, or Norway pout versus northern silvery pout (P ϭ 0.074 for all). Beta diversity analysis also demonstrated that community differences at the genus and species levels were similar to those observed at the order level (Table  S6). Differences in the intestinal community composition between these gadoids are predominantly explained by changes in the relative abundance of a limited number of orders. For example, different proportions of Vibrionales contribute 29% to the (Bray-Curtis) dissimilarity between the NCC and NEAC (P ϭ 0.001), followed by differences in the relative abundance of Bacteroidales, explaining 10% of the dissimilarity (P ϭ 0.001) (Table S9). Together, 80% of the observed dissimilarity between NCC and NEAC is explained by differences in their relative abundance of the top six orders. Similarly, 60% of the dissimilarity between NCC and northern silvery pout is driven by Vibrionales, Brachyspirales, and Clostridiales.
Bacterial within-species variation of SNV heterogeneity. We investigated bacterial within-species variation of P. iliopiscarium and P. kishitanii-with sufficient read coverage across all samples-among the different gadoids by mapping sequencing reads to their respective reference genomes (GCF_000949935.1 and GCF_000613045.2). In the samples used for single nucleotide variant (SNV) analysis, the mean percentage of the reference genomes with minimum 20-fold coverage (coverage breadth) after mapping were 63% for P. iliopiscarium and 19% for P. kishitanii. Hence, the variation analysis of the two species was based on different proportions of the reference genomes. The two reference genomes varied widely in the number of SNVs observed in all samples, from 84,866 in P. iliopiscarium to 1,229 in P. kishitanii (Fig. 4a). The density of variable sites within each individual sample showed various levels of heterogeneity in the bacterial populations (Fig. 4b). This heterogeneity was particularly clear in P. kishitanii, with site density varying from 0.5 to 45.4 variant positions per kbp per individual specimen. Further, the heat map showed gadoid-specific SNV patterns (Fig. 4c), in particular for P. iliopiscarium, where Norway pout contained a distinct pattern compared to the other gadoids, indicating the presence of specific P. iliopiscarium strain(s). Statistical analyses of SNV variation revealed that NEAC had a significantly different SNV pattern from Norway pout (Chi-square, P ϭ 0.017) and poor cod (P ϭ 0.028) for P. kishitanii, and from NCC (P ϭ 0.033) and Norway pout (P ϭ 0.000) for P. iliopiscarium (Fig. 4d, Table S10). NCC had a significantly different SNV pattern from Norway pout (P ϭ 0.003) for P. iliopiscarium. (Fig. 4d, Table S10). The relative abundance of P. kishitanii and P. iliopiscarium varied greatly among the fish specimens used in the variation analysis (Fig. 4e).  Colors represent the 28 orders with highest relative abundance, sequences assigned to other orders or viruses, and sequences classified above order level.
(Continued on next page)

DISCUSSION
Using metagenomic shotgun sequencing, we show the composition of the intestinal microbiomes of two Atlantic cod ecotypes (NEAC, NCC) to be at least as divergent as those found between the different codfish species investigated here. Our findings have several implications for our understanding of the composition of the intestinal microbiome in wild fish populations.
Although species-specific selection has been proposed as a factor driving the composition of the intestinal community in fish in a variety of settings (13, 14, 16-19, 29, 33), our results show that this may not be the most important driver among gadoid species in wild populations. First, we observed highly significant differences in the intestinal microbiomes at order, species, and within-species bacterial levels between the NEAC and NCC ecotypes. Despite showing different migratory behaviors, these ecotypes co-occur during seasonal spawning in northern Norway (Lofoten), from where most of the samples were collected (43)(44)(45). Second, we observed no significant bacterial order-or species-level differences in the intestinal microbiome between different gadoid species, Atlantic cod (ecotype NCC), and Norway pout, which were sampled from different geographical locations (Lofoten and Oslo Fjord). We did not observe differentiation between the NCC sampled from Lofoten and the Oslo Fjord (although statistical certainly was low), which reflects an earlier observed lack of geographical structure for this ecotype (30). The similarity of the microbial compositions of the NCC and Norway pout is striking, as these are distinctly different genetic lineages with an evolutionary separation of at least 20 million years (24). These results suggest that NCC and Norway pout occupy an environmental niche that allows bacterial members with a broad geographical distribution to colonize their intestinal communities. Overall, the observation of a significant differentiation between microbiomes from ecotypes of the same species and a lack of differentiation between microbiomes from two distinct species suggests that the intestinal microbiome in these gadoid species and ecotypes is not driven by species-specific selection alone.
There are several factors that may underlie the compositional differences in the NCC and NEAC intestinal microbiomes. First, for more than 10 months during the year, the two populations encounter different habitats, as the NEAC ecotype is distributed in the pelagic waters of the Barents Sea while NCC remains more stationary in coastal waters (49). Although several 16S rRNA-based studies have reported limited effects of geographic location on the composition and diversity of the fish intestinal microbiome (32,50), the Barents Sea has significantly lower temperatures (51) than Norwegian coastal waters (52). Temperature has been shown to have a significant impact on the intestinal microbiome in several studies, e.g., Senegalese sole (Solea senegalesis), Tasmanian Atlantic Salmon (Salmo salar), and mummichog (Fundulus heteroclitus) (5, 53, 54), but not in all cases (e.g., Atlantic salmon) (55). Second, the ecotypes were sampled during different seasons: NCC Lofoten during summer (August) and NEAC during winter/early spring (March). Nonetheless, a lack of difference between NCC Lofoten (August) and NCC Oslo Fjord (May) suggests that seasonality is unlikely to fully explain the observed differences between NEAC and NCC. Third, the ecotypes show different feeding behaviors; while the NEAC may perform vertical movements down to 500 m during foraging and spawning migrations from the Barents Sea (42,56,57), NCC mainly occupy shallow and warmer coastal and fjord waters (58). These behaviors expose the two ecotypes to different sources of food, with NEAC predominantly eating capelin and herring (48) and NCC living on a more diverse diet, including crustaceans, fish, and even seaweeds (34,39). Diet has been shown to influence the composition of the intestinal  microbiome in several fish species (9, 10, 13, 53, 59, 60). Finally, the Barents Sea has a high microbial biodiversity compared to coastal areas (61). The specific bacterial load in the surrounding waters also influences the intestinal microbiome composition in fish, including Atlantic cod (3,4). Nonetheless, because these different environmental and behavioral factors are correlated, it is unclear which of these parameters contributes the most to the observed differences in the intestinal microbiome composition between these ecotypes. Comparing two spatially separated coastal Atlantic cod populations, metagenomic shotgun data revealed no strain-level differentiation (30). In this study, we found specific SNV variants among the most abundant bacterial species that were associated with single species or specific Atlantic cod ecotypes. This indicates that NEAC harbor different strains of P. iliopiscarium than those identified in the NCC ecotype and the other gadoid species. Our current study encompasses a significantly greater geographical area and broader range of taxonomical samples than the earlier coastal comparison (30)(31)(32), and is indicative of strain-level variation at larger comparative scales. In line with Riiser et al. (30), this study shows that such strain-level differences cannot be detected using 16S rRNA techniques alone, and that metagenomic shotgun sequencing is currently the most accurate approach to detect strain-level spatial variation in the marine environment.
Most striking among the comparisons of gadoid species were the microbiome differences observed in NEAC, northern silvery pout, and poor cod compared to NCC and Norway pout. Several bacterial species that drive this differentiation are of particular interest. First, two bacterial species, Mucispirillum sp. and Brevinema sp., are almost exclusively detected in the intestinal microbiomes of NEAC and northern silvery pout. Nonetheless, these genera are represented by a single species in the RefSeq database (62) (accessed 10 January 2019) and hence little is known. Brevinema andersonii (order Brevinematales) was originally identified in short-tailed shrews (Blarina brevicauda) and white-footed mice (Peromyscus leucopus) and was found to be unable to grow below 25°C (63). Brevinema spp. have previously been identified in Atlantic cod (32) and in Atlantic salmon (64). Mucispirillum schaedleri (order Deferribacterales) is a mucosaassociated member of the intestinal microbiome in terrestrial animals such as pigs, goats, and rodents, where it is thought to be involved in mucus production through expression of lectins, important components in the innate immune response (65,66). Nevertheless, the distant relationship between Atlantic cod and these terrestrial hosts and the availability of only single reference genomes for Mucispirillum and Brevinema strongly suggest that the representatives found here are related but novel species with different intestinal ecologies and physiologies. Second, both NEAC and northern silvery pout contain significant fractions of Brachyspira spp., previously identified as dominant members in the gut of the carnivorous marine fish species mahi mahi (Coryphaena hippurus) (12,67). Brachyspira spp. are known as intestinal pathogens in pigs and humans (68,69), although recent studies show that Brachyspira spp. are more widespread in the wildlife community than previously thought, including in freshwater (70). The ecology of Brachyspira in the marine environment is unclear, although an association with the carnivorous diet of mahi mahi and NEAC may suggest that the diet of northern silvery pout also has a considerable carnivorous component. Third, poor cod is the only species with considerable abundance of Enterovibrio norvegicus (Table S11). This bacterium within the Vibrionaceae family was isolated from the intestines of cultured turbot (Scophthalmus maximus) larvae in Norway and classified as a novel species phenotypically similar to the Vibrio genus (71). Interestingly, poor cod also host the highest abundance of Vibrio spp. among the fish species in this study (Table S11). Other Enterovibrio species have been found in association with diseased corals (72) and internal organs of cultured fish species in the Mediterranean Sea (73)(74)(75). However, little is known about the function of this relatively novel genus in fish intestines.
Given the observations of species-specific selection for a similar microbiome in various teleosts and range of habitats (13, 14, 16-19, 29, 33), the diverse microbiomes within and among gadoid species may suggest that their intestinal communities could be more easily modulated by external factors. At this stage, limited sampling across various fish taxa and the lack of comparative approaches leave reasons for such diverse communities speculative. Nonetheless, it is interesting to note that all gadoids have an unusual adaptive immune system in which there is loss of MHC II, CD4, and invariant chain (Ii) and a range of innate (TLR) and MHC I immune-gene expansions (23,24). There are significant correlations between immune genes and the vertebrate microbiome (76,77), and it has been hypothesized that adaptive immunity has evolved to help maintain a complex community of beneficial commensal bacteria (78). Indeed, studies of wild-type zebrafish and knockout zebrafish without a functional adaptive immune system suggest that adaptive immunity increases the strength of host filtering of potential fish-associated microbes (22). The unusual adaptive immune system of gadoids may therefore affect the strength of coevolutionary associations within their microbiomes.
In conclusion, based on metagenomic shotgun sequencing, we here characterize the intra-and interspecific community compositions among two ecotypes of Atlantic cod and three related fish species in the Gadidae family. Several of these fish species harbor unique, and possibly novel, bacterial species. We identify a complex pattern of diversity with significant differences between the Atlantic cod ecotypes and variable interspecific patterns of variation. Although most species and ecotypes yield different communities, those found in coastal cod (NCC) and Norway pout are not significantly diverged, indicating that ecological niche plays an important role in determining the intestinal microbiomes in these gadoid species.  Table S1 in the supplemental material). NCC (2 individuals) were also collected in the Oslo Fjord (N58.9125100, E9.9202624 and N59.8150006, E10.5544914). Norway pout (Trisopterus esmarkii, 4 individuals), poor cod (Trisopterus minutus, 5 individuals), and northern silvery pout (Gadiculus thori, 3 individuals) were collected in the inner Oslo Fjord in May 2015 (Table S1). All fish specimens were collected from wild populations. A 3-cm-long part of the hindgut (immediately above the short, wider rectal chamber) was aseptically removed postmortem by scalpel and stored in 70% ethanol. The samples were frozen (-20°C) for long-term storage. Relevant metadata such as length, weight, sex, and maturity were registered. As we strive to reduce the impact of our sampling needs on populations and individuals, samples were therefore obtained as a by-product of conventional business practices. Specimens were caught by commercial vessels, euthanized by local fishermen, and intended for human consumption. Samples were taken postmortem and no scientific experiments have been performed on live animals. This sampling follows the guidelines set by the "Norwegian consensus platform for replacement, reduction and refinement of animal experiments" (79) and does not fall under any specific legislation in Norway, requiring no formal ethics approval.

MATERIALS AND METHODS
Sample preparation and DNA extraction. Intestinal samples were split open lengthwise before the combined gut content and mucosa were gently removed using a sterile disposable spatula. Each individual sample was washed in 500 l 100% ethanol (EtOH) and centrifuged before the ethanol was allowed to evaporate, after which dry weight was measured before proceeding to DNA extraction. DNA was extracted from between Ͻ10 and 300 mg dry weight of gut content using the MoBio Powersoil HTP 96 Soil DNA isolation kit (Qiagen, Valencia, CA, USA) according to the DNA extraction protocol (v. 4.13) utilized by the Earth Microbiome Project (80). DNA was eluted in 100 l elution buffer and stored at Ϫ20°C. Due to high methodological consistency between biological replicates in previous experiments, only one sample was collected per fish (32).
Sequence data generation and filtering. Quality and quantity of the DNA was measured using a Qubit fluorometer (Life Technologies, Carlsbad, CA, USA) and normalized by dilution. DNA libraries were prepared using the Kapa HyperPlus kit (Roche Sequencing, Pleasanton, CA, USA) and paired-end sequenced (2 ϫ 125 base pairs) on an Illumina HiSeq2500 using the HiSeq SBS V4 chemistry with dual indexing in two independent sequencing runs. Read qualities were assessed using FastQC (81) before adapter removal, singleton read identification, deduplication, and further read quality trimming was performed using Trimmomatic (ver. 0.36) (82) and PRINSEQ-lite (ver. 0.20.4) (83) (Table S2). PhiX, host, and human sequences were removed by mapping reads to the phiX reference genome (GenBank: J02482.1), the Atlantic cod genome assembly (gadMor 2) (this applied to all the fish species) (84), and a masked version of the human genome (HG19) (85) using BWA (ver. 0.7.13) (86) or BBMap (ver. 37.53) (87) (JGI) with default parameters and discarding matching sequences using seqtk (ver. 2012.11) (88). All sequence data have been deposited in the European Nucleotide Archive (ENA) under study accession number PRJEB31095.
Taxonomic profiling. Taxonomic classification of quality trimmed and filtered metagenomic pairedend reads was performed using Kaiju (ver. 1.5.0) (89) ("greedy" heuristic approach, -e 5), with the NCBI nr database (release 84) (including proteins from fungal and microbial eukaryotes) as reference (62). Counts of sequences successfully assigned to orders and species were imported into RStudio (ver. 1.1.383) (90), based on R (ver. 3.4.2) (91), for further processing. Filtering of the most abundant bacterial orders for visualization was based on a minimum relative abundance threshold of 1% of the total number of sequences per library (the threshold ranged from 5,933 to 95,146 depending on the sample size). Similarly, filtering of the most abundant bacterial species was based on a minimum relative abundance threshold of 2% of the total number of sequences per library (the threshold ranged from 6,548 to 190,294 depending on the sample size). Any taxon not exceeding this threshold in at least one (order-level) or two (species-level) samples was removed. All filtering was based on the R package genefilter (ver. 1.62.0) (92). Final results were visualized using the R package ggplot (ver. 2.2.1) (93). Note that, based on a recent reclassification (94), we refer to the reference strain Photobacterium phosphoreum ANT-2200 (accession number GCF_000613045.2) as Photobacterium kishitanii (Table S3).
Sequence variation analysis. In order to assess the heterogeneity of the most abundant bacteria in the fish species, we analyzed the sequence variation in the two genomes with the highest mean relative abundance over all fish species and ecotypes, Photobacterium kishitanii and Photobacterium iliopiscarium. Paired-end reads from each individual fish were mapped to the reference genomes (Table S3) using the Snakemake workflow (95) of anvi'o (ver. 5.1) (96) with default parameters in the "all-against-all" modus (with anvi-profile -min-coverage-for-variability 20). Samples of low coverage, restricting detection of SNVs in anvi'o, were excluded from the variation analysis. For each individual sample, variable sites were identified, and the mean number of these per 1,000 bp was calculated (variation density). A variable site required a minimum coverage of 20ϫ. Next, variable sites with a minimum of 20ϫ coverage in all samples were defined as single nucleotide variants (SNVs, anvi-gen-variability-profile -min-occurrence 1 -min-coverage-in-each-sample 20). Coverage, variation density, and SNV profiles were plotted in RStudio following the R script provided by anvi'o (97). The anvi'o SNV output was converted to .vcf format using a custom-developed script (https://github.com/srinidhi202/AnvioSNV_to_vcf), and the resulting .vcf files were used in a principal-component analysis (PCA) to test for population differences as implemented in smartpca (ver. 6.1.4) (EIGENSOFT) (98).
Statistical analysis. Although included in data visualization, the Oslo Fjord NCC samples were excluded from statistical analysis due to low sample size (n ϭ 2). Within-sample diversity (alpha diversity) was calculated using the diversity function in the R package vegan (ver. 2.4-1) (99) based on Shannon, Simpson, and inverse Simpson indices calculated from nonnormalized order-level read counts (Table S4). Differences in alpha diversity were studied using linear regression. The "optimal model" (the model that best describes the individual diversity) was identified through a "top-down" strategy, including all covariates (Table S5) except weight, which highly correlated with length (r ϭ 0.95), and selected through t tests. Model assumptions were verified through plotting of residuals. Differences in bacterial community structure (beta diversity) between the fish species or ecotypes were visualized using nonmetric multidimensional scaling (NMDS) plots based on the Bray-Curtis dissimilarity calculated from order-level sequence counts. Next, pairwise differences in beta diversity between the fish species or ecotypes were tested using permutational multivariate analysis of variance (PERMANOVA) in the R package pairwise.adonis (ver. 0.1) (100), a wrapper for the adonis functions in vegan (ver. 2.4-1), based on Bray-Curtis dissimilarity calculated from order-, genus-and species-level sequence counts. The package pairwise.adonis was run with 20,000 permutations, and P values were adjusted for multiple testing using the Holm method (101). Adjusted P values of Ͻ0.05 indicate statistical significance. PERMANOVA assumes the multivariate dispersion in the compared groups to be homogeneous; this was verified (P Ͼ 0.05) using the betadisper function (vegan) ( Table S6). Similarity percentage (SIMPER) procedure implemented in vegan was used to quantify the contribution of individual orders to the overall Bray-Curtis dissimilarity between the species/ecotypes. All beta diversity analyses were based on sequence counts normalized using a common scaling procedure, following McMurdie and Holmes (102). This involves multiplying the sequence count of every unit (e.g., order) in a given library with a factor corresponding to the ratio of the smallest library size in the data set to the library size of the sample in question. Normalization using this procedure effectively results in library scaling by averaging an infinite number of repeated subsamplings. We used chi-squared statistics, as implemented in smartpca (98), to test for significant differences in the distributions of SNVs per reference genome while correcting for multiple testing using sequential Bonferroni (101).
Data availability. The data set generated and analyzed for this study is available in the European Nucleotide Archive (ENA) under study accession number PRJEB31095.

SUPPLEMENTAL MATERIAL
Supplemental material is available online only. SUPPLEMENTAL FILE 1, XLSX file, 0.6 MB.

ACKNOWLEDGMENTS
We thank Børge Iversen and Helle Tessand Baalsrud for their kind help in sampling Atlantic cod specimens in Lofoten and Martin Malmstrøm, Paul Ragnar Berg, and Monica Hongrø Solbakken for sampling at Sørøya. We are grateful for the metagenome sequencing performed at the Norwegian Sequencing Centre (NSC: https://www .sequencing.uio.no).
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. S