Population dynamics of Baltic herring since the Viking Age revealed by ancient DNA and genomics

Significance A growing body of research indicates that ocean ecologies have been more impacted by human exploitation and for longer than previously understood. Here, we evaluate human impact on Baltic herring, an ecologically, culturally, and economically important species with an iconic history of exploitation. We observe genomic evidence of the earliest long-distance trade for this species, providing evidence for a longer exploitation history than previously understood. Observations of serial exploitation are consistent with classic patterns of resource depletion. Our results illustrate the importance of including knowledge regarding long-term population dynamics, including differential stock responses to climate change, in sustainable management strategies, as efforts to achieve food security by aquaculture- and marine-based industries are demanding ever-increasing resources from the oceans.


SNP calling and filtering
Modern samples were filtered using bcftools v1. removed. An additional dataset with no MAF frequency filtering was created for gone analyses known to be sensitive to removing minor alleles. Mitogenomes were called and filtered as above for ancient and modern samples. Individuals with >30% missingness were removed.
Removing outliers from the whole-genome dataset of modern herring specimens Our dataset contains whole genome data of herring specimens obtained from a range of difference sources, collected over a number of years (see Supplementary Dataset S2). The dataset further combines publicly available data with data generated de novo for our study. Given this wide range of sources, we performed a number of exploratory analyses to ensure data integrity. First, we previously identified that two of the individuals from the Han et al. 9 dataset were technical duplicates 10 using KING 11 (Table S1). One of these individuals was chosen at random to be included in the dataset and the other was discarded. Second, exploratory population analyses using smartPCA 12,13 indicated that several individuals were significant outliers ( Figure S2). To examine this pattern, levels of SNP-based levels of heterozygosity were calculated for the modern nuclear data using --het from VCFtools 8 .
Three individuals (HER_Z12_IsleOfMan, HER_NSSH34, and M-HER004) showed unusually high levels of heterozygosity ( Figure S3), which is a measure of possible poor read mapping and contamination 16 . These individuals therefore show unusually high similarity in genetic information and/or poor read mapping yet high heterozygosity. This could be the result of cross-contamination between specimens. We further assessed the possibility of contamination by analyzing levels of heterozygosity along the mitogenome ( Figure S4). As a haploid sequence, expected heterozygosity for each locus is 0. For those individuals containing loci with heterozygosity >0 this is indicative of contamination. Again, three individuals (HER_NSSH33, HER_NSSH34, and HER_Z12_IsleOfMan) showed clear signs of contamination. Based on these results, four individuals (HER_Z12_IsleOfMan, HER_NSSH33, HER_NSSH34, and M-HER004) were removed from the dataset. Finally, one individual (AAL1_CelticSea) from the Han et al. (2020) dataset was removed as it consistently clustered with individuals with non-matching metadata ( Figure S5). Given other inconsistencies of the metadata, including the duplicated sample, this sample was also removed. The cleaned dataset was used for demographic (runs of homozygosity, KING, and gone) and for whole-genome PCA analyses.

BAMscorer sensitivity analysis
In order to determine which ancient sequences could reliably be scored, we first assessed the required read depth for accurate assignment in each of the three assignment tests (Chromosome 12 inversion, spawning season, and salinity adaptation). Required read depth per assignment was assessed following the down-sampling and bootstrap method of Ferrari & Atmore et al. 10 . We selected 8 samples as a training dataset from the modern reference dataset to independently test the power of assignment probability with known metadata. The alignment files of these eight samples were randomly down-sampled to between 1k and 100k reads 20 times and then used to assess BAMscorer sensitivity for each assignment test.
Given the relatively low number of autumn spawning herring, we investigated whether including the removed contaminated individuals (see above) impacted the biological patterns for each comparison. Although these outliers impact the whole genome population analyses, SmartPCA analysis on each of the three BAMscorer assignment databases showed that including these outliers associated with contamination bias did nothing to change the shape of the distributions. Due to limited sample size for the different categories in these assignment tests, several outliers were left in the BAMscorer databases apart from AAL1_CelticSea (suspected incorrect metadata) and Gavle54 (identical to Gavle98) from the Han et al. 9 dataset. Full information on which samples were used for each stage of analysis can be found in Supplementary Dataset S2.
The modern database showed strong differentiation between spring and autumn/winter spawning seasons, following previously reported results 9,17,18 . Sensitivity analysis showed that spawning season can be accurately determined in alignment files with as few as 50 000 reads ( Fig S6). The chromosome 12 inversion could be confidently assigned with as few as 5 000 reads ( Fig S7) using default parameters. Given that only samples for which spawning season could be assigned were analyzed, all samples still retained at least 50 000 reads. Salinity scores could be determined for samples with at least 60 000 reads ( Fig S8).

Evaluating demographic independence
Demographic independence of the Baltic subpopulations was further assessed by calculating individual pairwise relatedness for each group with KING. To assess the substructure in the Baltic, metapopulations were grouped, resulting in two Baltic populations: autumn spawners and spring spawners (for full KING results, see Supplementary Dataset S2). In the Transition Zone (TZ) population, M-HER066 showed strongly negative kinship coefficients with the rest of the TZ individuals. This is a sign that there is population structure, thus M-HER066 is likely not actually a TZ individual, but could be part of the Atlantic spring spawners population, which is where it clusters on the PCA. M-HER066 was removed from analysis to eliminate outlier bias. Kinship coefficients were plotted per population to visualize the distribution of relatedness in each population (See Fig S9).
A one-way ANOVA showed that metapopulation ID was significantly associated with kinship coefficient mean and variance (p=8.16e-14, DF=4). Baltic spring spawners had the lowest average kinship coefficient (0.0336). Baltic autumn spawners had the second highest kinship coefficient (0.05827), likely due to the effect of the Fehmarn individuals, which showed high relatedness to each other. The Fehmarn within-group average was 0.08. This illustrates there is some substructure in the herring metapopulations, although it is likely that there is a degree of connectivity, as strong substructure would result in negative kinship coefficients. It should be noted that all individuals showed low levels of relatedness and there was some variance in relatedness between all groups. The Transition Zone showed the highest variation in kinship coefficient, likely due to the presence of subpopulations such as the Idefjørd herring.
ROHs were compared by length, count, and total sum between the 4 Baltic populations.

5' C to T Transitions 3' G to A Transitions
Position Position Figure S1 -Fragment Misincorporation and ancient DNA Damage plots. a) Patterns were obtained by using MapDamage v. 2.0.6 after down-sampling BAM files to 1 million reads. These plots show the typical fragmentation and the postmortem deamination patterns that are characteristic of those associated with ancient DNA 23 . We simultaneously plot all archaeological herring specimens (n=40) b) Fragment misincorporation and damage plots for modern herring sequence data (n=72).  Figure S2 -Exploratory population analyses of Atlantic herring using genome-wide data. The PCA is based on 10,368,446 SNPs. The color indicates the spawning season and the shape indicates the sea in which the sample was collected. "Transition" refers to the area between Norway, Sweden, and Denmark that spans the transition between the North Sea and the Baltic Sea. Three Atlantic herring specimens are located away from the main herring clusters. This pattern is driven by inclusion of a single specimen (HER_Z12_IsleOfMan) that is contaminated (see also Figure S3, S4, and S5).  , excluding eight modern herring specimens that were used as a test dataset with known metadata. b) Application of BAMscorer to the bootstrapped downsampled alignment files of the eight test specimens with known metadata show greater difficulty assigning spring spawning than autumn spawning. Each individual BAM file from the training dataset was randomly down-sampled to between 500 and 100,000 reads for 20 iterations. The y-axis represents the proportion of those files which are known to be type AA or type BB that were assigned correctly using BAMscorer. Both seasons can be assigned with no error with a minimum of 50,000 reads. AA is autumn spawning and BB type is spring spawning. c) The inversion locus within autumn spawners can be scored with as few as 5000 reads. Three of the eight test specimens were autumn spawners, one Baltic type and two Atlantic type (Supplementary Dataset S2). Each individual BAM file from the was randomly down-sampled to between 500 and 100,000 reads for 20 iterations. The y-axis represents the proportion of those files which are known to be type AA or type BB that were assigned correctly using BAMscorer.  Figure S8 -Salinity adaptation loci BAMscorer database creation and parameter testing. a) PCA generated with smartPCA. Color indicates ocean basin as a proxy for salinity (high being Atlantic, low being Baltic) and shape indicates population of origin. The PCA is based on 2303 divergent SNPs obtained from Han et al. 9 Some Atlantic samples were from Norwegian fjords with low salinity therefore they fell between the two groups. These samples were removed from the dataset for the final scoring. b) Eight individuals with known metadata were removed from the reference dataset for sensitivity analysis (Supplementary Dataset S2). Iterative application of BAMscorer to down-sampled alignment files showing a minimum of 50,000 reads is required for determining salinity adaptation. Each individual BAM file from the eight test specimens training dataset was randomly down-sampled to between 500 and 100,000 reads for 20 iterations. The y-axis represents the proportion of those files which are known to be type AA or type BB that were assigned correctly using BAMscorer.  They increase during the LIA and rapidly decline during the period of known autumn spawning population collapse coinciding with the increase of the sprat (Sprattus sprattus) population; c) Central spring spawners show an increase around the time of the decline of the western autumn spawners, then a decrease again at the end of the LIA as well as another dramatic decrease around the time of the autumn spawners' collapse ~100YBP; d) Gulf spring spawners decrease during the LIA and then increase dramatically at the end of the LIA, starting to decline only in very recent generations when fisheries in the gulfs intensify. This panel illustrates the accuracy of using generation times between 2-5 years for Baltic herring. GSS were not intensively fished until very recently, therefore their demographic trajectory should reflect changes in environmental factors such as sea surface temperature and presence of ice cover. Generation times 2 and 5 reflect best the expected changes in population size given known responses to such environmental changes in Baltic herring. Here, Baltic spring spawners were grouped as a single metapopulation (n=10) and gone was run with the same parameters as before. This figure shows the projected demographic trajectory when both the GSS and CBSS are treated as a single population. Historical events are again marked, illustrating the relationship between fishing pressure, changing climate, and demographic trends in Baltic herring.