Population changes in a whale breeding ground revealed by citizen science noninvasive genetics

Historical exploitation, and a combination of current anthropogenic impacts, such as climate change and habitat degradation, impact the population dynamics of marine mammalian megafauna. Right whales ( Eubalaena spp.) are large cetaceans recovering from hunting, whose reproductive and population growth rate appear to be impacted by climate change. We apply noninvasive genetic methods to monitor southern right whale ( E. australis , SRW) and test the application of noninvasive genetics to minimise the observer effects on the population. Our aim is to describe population structure, and interdecadal and interannual changes to assess species status in the Great Acceleration period of Anthropocene. As a basis for population genetic analyses, we collected samples from sloughed skin during post-migration epidermal moult. Considering the exploration-exploitation dilemma, we collaborated with whale watching companies, as part of a citizen science approach and to reduce ad hoc logistic operations and biopsy equipment. We used mitochondrial and microsatellite data and population genetic tools. We report for the first time the genetic composition and differentiation of the Namibian portion of the range. Population genetic parameters suggest that South Africa hosts the largest population. This corresponds with higher estimates of current gene flow from Africa compared to older samples. We have observed considerable interannual variation in population density at the breeding ground and an interdecadal shift in genetic variability, evidenced by an increase in the point estimate inbreeding. Clustering analyses confirmed differentiation between the Atlantic and Indo-Pacific, presumably originating during the ice ages. We show that population monitoring of large whales, essential for their conservation management, is feasible using noninvasive sampling within non-scientific platforms. Observed patterns are


Introduction
The decline in marine megafauna abundance accelerated rapidly during the period of commercial whaling (Jackson et al., 2008).An estimated total of 150,000 southern right whales (Eubalaena australis; Desmoulins, 1822; hereafter SRWs) were killed by whalers, leading to near extinction of the species and a demographic bottleneck (Jackson et al., 2008).Since the halt of all whaling activities on SRWs in 1971 (Tormosov et al., 1998) the recovery has been relatively successful, particularly when compared to other Eubalaena species (e.g., Fujiwara and Caswell, 2001).The global population is estimated to have increased at roughly 7% per annum, reaching ca.13,600 animals by 2009 (International Whaling Commission, 2013).However, despite their continued international protection from whaling, SRWs are facing new threats from human activities, such as global climate change (e.g., Agrelo et al., 2021, van den Berg et al., 2021).For example, foraging grounds for some populations are located in regions of the Southern Ocean that have already been significantly affected by global climate change (Harcourt et al., 2019;Bestley et al., 2020).
Current population structure of large mammals may be affected by historical decline, range fragmentation and population isolation (Frankham et al., 2004), but may also be further facilitated by adaptive evolution and ecotype diversification (Foote et al., 2016).Previous analyses of mitochondrial genetic variation in SRW identified two populations corresponding to the South Atlantic and Indo-Pacific ocean basins (Patenaude et al., 2007).Differences in nuclear genetic markers between the wintering grounds have also been reported (Carroll et al., 2019), with generally higher levels of diversity in the South Atlantic than in the Indo-Pacific breeding grounds (Carroll et al., 2019).SRW mtDNA consistently shows stronger divergence than microsatellite markers, indicating female philopatry (Rowntree et al., 2001;Carroll et al., 2019;Best, 2000).
Currently, several demographically independent and genetically distinct SRW wintering grounds have been identified, including the coastal waters of; the southwestern Atlantic (Argentina and Brazil); South Africa; Australia (southeast and southwest populations with multiple nursery grounds); and the New Zealand sub-Antarctic (Best, 2000;Carroll et al., 2019).Carroll et al. (2020) suggest ongoing gene flow among these nursery grounds, possibly through shared foraging ground use facilitating gene flow (Carroll et al., 2015).For example, both photo-ID matching (Best et al., 1993) and stable isotope analysis (Rowntree et al., 2001;van den Berg et al., 2021) suggest SRWs from South Africa and Argentina share the same common feeding grounds around Tristan Da Cunha/Gough Island, Bouvet Island.Carroll et al. (2019), building on work by Patenaude et al. (2007), suggested the phylogeographic pattern observed in SRW was due to a recent increase in connectivity, after isolation and secondary contact between ocean basins.Currently, evidence suggests there is a heterogeneity of gene-flow estimates across the Southern Hemisphere.
This paper focuses on the southern African breeding ground to study details of the SRW population structure.The South African population is considered to be the largest one globally, with an estimated population of ca 6116 whales (Best, 2000;Brandão et al., 2018), while the knowledge of the Namibian part of the range is restricted to scarce sightings (Roux et al., 2015).Currently, the South African population is undergoing a pronounced shift in various demographic and reproductive parameters (Vermeulen et al., 2018).Specifically, the 1990s period was characterized by a high calving rate (Brandão et al., 2010) followed by a period of reduced calving rate and decreased population growth rate during the 2010s (Brandão et al., 2018;Roux et al., 2015).A recent study revealed a concurrent shift in the foraging ecology of the South African SRWs over the past two decades based on stable isotope analyses of skin samples (van den Berg et al., 2021), Obtaining genetic samples from marine mammals is logistically difficult.In cases of stranded or entangled marine mammals, it is possible to sample dead or live animals directly.However, such cases are scarce and unpredictable, and in the case of mortalities, the samples represent individuals that are no longer part of the population.Sampling living animals allows for consistent data collection when conducted in habitats that are used predictably, such as the nursery grounds of whales.Sampling usually involves taking skin biopsies remotely using a crossbow or dart rifle which is considered an invasive technique subject to ethical standards approval (e.g., Baker et al., 1999;Carroll et al., 2013;Valenzuela et al., 2009).Alternatively, skin samples from cetaceans can also be obtained by collecting sloughed skin found at the surface, especially after high-energy behaviours such as breaching, lobtailing, mating or vocalizing (e.g.Amos et al., 1992;Clapham et al., 1993;Valsecchi et al., 1998).
The present study focuses on SRWs and reflects the urgent need to a) obtain robust data to draw conclusions about the effects of the above processes on the population dynamics of megafauna planktivores, and b) assess the use of minimally invasive or noninvasive approaches to do this (Carroll et al., 2018).In the present paper, we aim to study: (i) Utilization of a noninvasive approach: We collect SRW sloughed skin at the South African breeding ground using a non-scientific platform (citizen scientists aboard whale watching boats) and evaluate this framework as a tool for long-term genetic monitoring of whale populations;.(ii) Population structure and dynamics of southern right whales in Africa: By adding recent population sample, we provide an update (Patenaude et al., 2007;Carroll et al., 2019Carroll et al., , 2020) ) on the population structure and genetic diversity of the African population in the post-whaling period.We also present the first-ever insight into the genetics of SRW from Namibia;.
P. Neveceralova et al. (iii) Transoceanic differentiation and gene flow: Based on these data and published, comparative genetic data from other populations (Carroll et al., 2019), we also examine the potential differentiation and phylogeographic relationships of the African population to the rest of the range;.(iv) Demography: We aim to study interannual (by comparing specific years), interdecadal (by comparing older -1990s and more recent -2010s data) and long-term (using neutrality tests and coalescent theory) demography of the African population.We hypothesize that changes in population biology occurring in response to changes during the Great Acceleration (Steffen et al., 2015) can be revealed through such an approach.

Sample collection in South Africa and Namibia
The majority of samples (n = 264) used in this study were obtained noninvasively by collecting sloughed skin from whale watching boats conducting commercial trips during the austral winters of 2016 -2018 in the area of Gansbaai and Walker Bay, South Africa.Pieces of skin were spotted in the water, picked up by a dip net and transferred with sterile tweezers to a tube containing 96% ethanol.Additional samples (n = 25) were collected from a research boat by remote biopsy using a crossbow and Cetadart darts (Lambertsen, 1987).All samples were stored at − 18 • C. Another 32 biopsy samples were available in archive held by University of Pretoria Mammal Research Institute Whale Unit.These samples were collected in two different regions, South Africa (n = 30) and Namibia (n = 2), between 2003 and 2013 (Supplementary Material: Table S1).Furthermore, 39 samples from 2016 and 30 older South African samples, previously analysed and published in Carroll et al. (2020), were reanalysed here to help ensure consistent genotyping procedures (see below).

DNA extraction and DNA profile construction
Tissue was pulverised in liquid nitrogen and DNA was extracted using either the QIAGEN DNeasy Blood & Tissue Kit or the GENEAID Genomic DNA Mini Kit.DNA profiles, comprising multi-locus microsatellite genotype, genetically identified sex and mitochondrial control region sequence were then constructed for each sample.Seventeen microsatellite loci were grouped into multiplexes and amplified in 10 μl PCR reactions (Carroll et al., 2015; Supplementary Material: Table S2).Multi-locus microsatellite genotyping was done according to sample type.For noninvasive samples, a multi-tube approach (Taberlet et al., 1996) was attempted, with each DNA extraction being amplified at all loci up to three times.For biopsy samples, all loci were amplified once.To ensure consistency with previously generated datasets, a set of 30 samples that were previously analysed in Carroll et al. (2020) were also amplified and genotyped.
Sex was determined by amplification of the male specific SRY gene, multiplexed with an amplification of the ZFY/ZRX region as a positive control (Aasen and Medrano, 1990;Gilson et al., 1998).An approximately 550 base pair fragment of the left hypervariable domain of mtDNA control region adjacent to the Pro-tRNA gene was amplified according to Baker et al. (1999).The resulting PCR product was purified by either QIAGEN QIAquick PCR Purification Kit or GENEAID GenepHlow PCR Cleanup Kit and sequenced using BigDye chemistry on a 3130 Genetic Analyzer (Applied Biosystems).Chromatograms were visualized and edited in Geneious Prime v2020.2.4 (© Biomatters Ltd.).
For the microsatellite data, quality filtering, allele calling and binning was performed in the program GENEMAPPER 5 (Applied Biosystems).For samples where loci were run more than once, the final genotype was constructed by choosing the highest quality allele calls for each locus, as determined by the quality score in GENEMAPPER.Any samples where the allele calls disagreed between runs were removed from the dataset.Genotypes that failed to amplify for seven or more loci were considered poor quality and were removed from the dataset.Genotype error rates were calculated per allele (Pompanon et al., 2005) using the internal control samples amplified in every PCR and replicate samples.We report the completeness of the final dataset in terms of number of loci per sample.

Generating the African DNA profile catalogue
First, genotypes within a year were reconciled to identify the number of unique whales sampled per austral winter season.Then, unique genotypes across years were compared to understand between year recaptures and the total number of whales sampled over the survey period.CERVUS 3.0.7 was used to identify these within and between years genotype matches (Kalinowski et al., 2007) with the minimum number of matching loci set to at least eight.Pairs of genotypes that matched at eight loci but mismatched at up to three loci were reviewed and repeated if necessary to verify the individual's identity or difference (Constantine et al., 2012).

Integration with published data and structuring of dataset
The data generated for this study were augmented with published mtDNA sequence (n = 1355) and microsatellite genotype (n = 222) data from South Africa, South America and the Indo-Pacific (Carroll et al., 2019; https://doi.org/10.5061/dryad.vv5347p),resulting in a total dataset of 410 individuals with microsatellite genotypes and 1602 individuals with mtDNA control region sequences (https://doi.org/10.5061/dryad.pvmcvdnnb) .In the present study, a priori definition of populations was used, concordant with a population genetic approach (Waples and Gaggiotti, 2006).Geographic sorting comprised population samples from coastal waters of Africa (South Africa and Namibia) and comparative data from coastal waters of South America (Argentina) and Indo-Pacific (Australia P. Neveceralova et al. and New Zealand) (Fig. 1).Considering the longevity within the family Balaenidae, including up to 268 years in bowhead whales (Balaena mysticetus) (Mayne et al., 2019), recent and earlier samples were pooled into one dataset for range-wide phylogeographic and demographic analyses.The South African dataset was sorted according to A) geography (sampling locality, Fig. 1) and B) chronology (recent samples from South Africa -2010s as listed in Supplementary Material: Table S1, reduced breeding success; earlier samples from South Africa -1990s;published in Carroll et al., 2019, high breeding success).

Population structure of African population and transoceanic phylogeographic relationships
Potential null allele frequency was calculated for each locus and each population using FreeNA software (Chapuis and Estoup, 2007) with 100,000 replicates.Genetic distances parameters like F ST and F IS were computed in FSTAT 2.9.4 (Goudet, 1995) using Weir and Cockerham (1984) and other nuclear population genetic parameters (h, gene diversity; N e , number of effective alleles; I, Shannon's information index; H o , observed heterozygosity; H e , expected heterozygosity and PA, number of private alleles) were calculated using GenAIEx 6.503 (Peakall and Smouse, 2006).Interdecadal and interannual comparisons were performed for particular measures of variability for the African population.
Population structure was explored using Bayesian clustering of individual genotypes as implemented in the program Structure 2.3.4 (Pritchard et al., 2000;Falush et al., 2007) and by using a nested approach.This analysis was performed on African datasets sorted geographically and chronologically both with and without sampling location priors, and on the global dataset.Each analysis was set to 10 6 MCMC reps after a 10 5 burn-in period and repeated 10 times for each value of K, which varied from one to five, with the admixture model and allele frequencies correlated with populations.Values for each K were combined in the program Clummp (Jakobsson and Rosenberg, 2007).The statistical support for the results of each K was analysed using the program Structure Selector (Li and Liu, 2018) using statistics MedMed K, MedMean K, MaxMed K and MaxMean K (Puechmaille, 2016) and ΔK (Evanno et al., 2005).(International Whaling Commission, 2020;Best, 2007;Zerbini et al., 2018) estimates of gene flow between marked breeding grounds from BayesAss (grey arrows, length of the arrows represents approximate values and numbers exact values).Putative summer feeding or foraging grounds during austral summer months are displayed by red hatch pattern (Best, 2007;Zerbini et al., 2018;Weir and Stanworth, 2020;Smith et al., 2012;Mackay et al., 2020;Carroll et al., 2020;Baines and Weir, 2020;van den Berg et al., 2021).Heatmap shows the summer near-surface chlorophyll a concentration (according to Deppeler and Davidson, 2017), a useful proxy of biomass for primary production.The map on the right is showing sampling sites in southern Africa and the number of collected samples in particular locations.
P. Neveceralova et al.Population structure was further analysed using a Discriminant Analysis of Principal Components (DAPC) implemented in the R package ADEGENET (Jombart, 2008), run in RStudio (https://rstudio.com/).First, patterns of genetic structure were investigated by carrying out DAPC using the African dataset with the aforementioned sample partitions; (A) geographic; (B) decadal (1990s or 2010s), with this information as a prior (Jombart et al., 2010), then on (C) the global dataset with wintering ground location of sampling set as the prior.Second, both African datasets, sorted geographically and chronologically, and the global dataset, were explored for cryptic variation using the find.clustersfunction (de novo).This function runs successive K-means clustering with increasing number of clusters (K) to assess the number of clusters which maximize between group variance and minimize within group variance (Jombart et al., 2010).For selecting the optimal K, the Bayesian Information Criterion (BIC) was applied for assessing the best supported model.Finally, DAPC was performed on the pre-defined clusters using the dapc function.To not overfit the discriminant function, we chose the optimal number of principal components for the DAPC using the optim.a.score function.
Contemporary effective migration rate (m) among (a) ocean basins (South Atlantic and Indo-Pacific); b) three populations (Africa; South America; Indo-Pacific) was estimated in BAYESASS 3.0.4(Wilson and Rannala, 2003).Twelve independent runs with different seeds were performed with a burn-in of 5 × 10 6 iterations followed by 5 × 10 7 iterations, sampling every 2000 iterations; all other parameters were kept at their default values (0.10).The resulting traces were visually checked for convergence in Tracer v1.7.1 (Rambaut et al., 2018) and different runs were checked for consistency.
Sequences of mtDNA were aligned using the MUSCLE 3.8.31(Edgar, 2004) alignment tool in GENEIOUS 9.1.8and cropped to 381 bp.Φ ST for mtDNA was estimated using ARLEQUIN 3.5 (Excoffier and Lischer, 2010) and statistical significance was assessed in the same program.Other mtDNA parameters (Nh, number of haplotypes; Hd, haplotype diversity; Np, number of polymorphic (segregating) sites; π, nucleotide diversity; D, Tajima's D; Fs, Fu's Fs statistics; PH, number of private haplotypes), were calculated using DNASP 6.10.01 (Rozas et al., 2003).P-distance was calculated using MEGA software (Tamura et al., 2021).A mitochondrial haplotype network was constructed using the Median Joining method in NETWORK 5.0.1.1.(Bandelt et al., 1999) for the global dataset and for South African samples to compare mitochondrial structure between 1990s and 2010s

Demography
To show changes in effective population size over time, we used a coalescent approach using the available mtDNA sequence data analysed in this study.Bayesian Skyline Plots were computed in BEAST 2 (Drummond and Rambaut, 2007).Considering limits of the power of the method, the pooled sampling strategy used in present study was evaluated as balanced sensu Heller et al. (2013).To approximate the assumption of panmixia (Heller et al., 2013) and given the probable diversification of demographic histories in particular parts of the range (see differences in neutrality tests and Carroll et al., 2019), the analyses were performed separately for particular populations (Africa, South America, Australia, New Zealand).The model of sequence evolution was ascertained following the Akaike criterion in MRMODELTEST2 (Nylander, 2004).A strict molecular clock rate (6.78 × 10 − 3 substitutions/site/My; 95% HPD: 0.00604, 0.00746) was ascertained (Dornburg et al., 2012).The Markov chain Monte Carlo simulations were run with 1 × 10 8 iterations three times for each population, while genealogies and model parameters were sampled each 1000 iterations.The first 10% iterations of each run were discarded as burn-in.Mixing and convergence were assessed by visualising the traces in Tracer v1.7.1 (Rambaut et al., 2018) and combining the remaining results in LOGCOMBINER to assess the effective sample size.

Efficiency and genotyping success of sloughed skin and biopsy samples
In total, 264 sloughed skin (noninvasive) samples and 61 biopsy samples were collected.After quality filtering, this resulted in 233 genotypes that passed QC (Supplementary Material: Table S3).Of the 264 sloughed skin samples, 179 or 67.8% produced genotypes that passed QC, compared with 55 or 90.2% of skin biopsy samples.Sloughed skin and biopsy samples produced similarly complete genotypes, but sloughed skin samples had a higher error rate (Table 1).The types of error were also different; sloughed skin genotyping errors were 53.66% allelic drop-out and 46.34% false alleles, while biopsy sample genotyping errors were 25% allelic drop-out and 75% false alleles.Due to the presence of a moderate frequency of null alleles (x < 0.20) (Supplementary Material: Table S4), the locus TR3G1 was discarded.Due to the low genotyping success, loci EV37 (65.26%) and RW48 (67.89%) were also removed from the dataset and all subsequent analyses were performed with 14 loci.

Genotype catalogue and recaptures
Comparison of genotypes within years showed 189 unique genotypes, assumed to represent unique whales, sampled using noninvasive (n = 140) or biopsy (n = 49) methods in the African population (Supplementary Material: Table S3).Almost half of the recaptures from noninvasive sampling (n = 21/43) within a year were on one day when there were several whale watching trips.There were eight whales re-sampled across years, as far apart as decades (Supplementary Material: Fig. S1; Supplementary Material: Table S5).There was substantial interannual variation in the number of collected samples (Supplementary Material: Table S3).
Comparison of genotypes across sampling regions in South Africa revealed two whales sampled in more than one location.Of the 189 samples that amplified for sex, 52.93% (n = 100) were males and 47.07%(n = 89) were females, i.e., the sex ratio was 1.13 (Supplementary Material: Table S3; Supplementary Material: Table S1).

Differences across regions and time periods in Africa
Here we assessed whether there were differences within the African dataset linked to A) geography and B) decades.Analysis (A) was based on the 189 samples collected in the 2010s (Fig. 1; Supplementary Material: Table S1) across the coast of southern Africa and presented here, a subset of which were previously analysed by Carroll et al. (2019).Analysis (B) used the samples from (A) to represent the 2010s and samples from South Africa collected in the 1990s and published by Carroll et al. (2019), for a combined total 235 individuals.

Geography
Pairwise microsatellite F ST values between particular African regions (Supplementary Material: Table S6) ranged from 0 to 0.0058, however, none of these values were statistically significant.STRUCTURE analyses of the recent African population (dataset A) using geographic information as a prior showed parameters MedMed K, MedMean K, MaxMed K and MaxMean K= 3 and ΔK= 2 (Fig. 2a).STRUCTURE analysis for the same dataset with no prior population settings also gave ΔK= 2. The DAPC scatterplot of the dataset (A) with a priori grouping showed two different clusters.First, the samples from South Africa collected in different regions overlapped each other, with the two Namibian samples forming a second distinct cluster (Fig. 2b).The BIC model in DAPC analysis for the same dataset using the find.clustersfunction suggested K= 4 as the best supported model for the African population.The ϕ ST ranged from 0 to 0.141, none of these values were statistically significant (Supplementary Material: Table S6b).The mtDNA haplotype network (Supplementary Material: Fig. S2) consisted of two haplogroups as previously described (Patenaude et al., 2007).

Decadal comparison
Interdecadal comparison revealed a higher inbreeding rate (F IS ), although confidence intervals for these estimates overlap (Table 2a), and slightly reduced haplotype variability in the 2010s decade (Table 2b).The microsatellite-based pairwise F ST between the 1990s and 2010s was 0.0002 (p-value > 0.05), while the mtDNA based ϕ ST was 0.00060 (p-value > 0.05).
The programme STRUCTURE, used to compare South African samples from the 1990s and 2010s with the decade information set as a prior, showed parameters MedMed K, MedMean K, MaxMed K and MaxMean K = 1 while ΔK = 2 (Supplementary Material: Fig. S3a).STRUCTURE analysis for this dataset for South Africa with no prior population settings also gave a ΔK = 2 (Supplementary Material: Fig. S4a).DAPC of this dataset with decades as a prior setting separates individuals into two clusters, however, those clusters overlap, indicating minimal population differentiation (Supplementary Material: Fig. S3b).The BIC model in DAPC analysis for dataset C using the find.clustersfunction suggested that K = 4 as a best supported model for the South African population.Out of four clusters, one was a cluster of 49 samples from the 2010s only, the other clusters were a mix of samples from both decades.The Fu's Fs statistics differ within the 1990s and 2010s periods in South Africa as this parameter decreases with the time (Table 2b).

Phylogeography and demography
Analysis by STRUCTURE with sampling locality as a prior showed differentiation between the Atlantic and the Indo-Pacific ocean basins with ΔK= 2, parameters MedMed K, MedMean K= 2 and parameters MaxMed K and MaxMean K= 3 (Fig. 3a).STRUCTURE with no  locprior settings showed ΔK= 2 (Supplementary Material: Fig. S4b).The scatterplot of a priori DAPC analysis supports the Atlantic -Indo-Pacific differentiation (Fig. 3b).Global F ST using ENA was 0.008539 and pairwise comparisons show F ST values between the Indo-Pacific, South America and Africa were low but statistically significant (F ST = 0.002 -0.011, p < 0.05, Supplementary Material: Table S7).The microsatellite diversity is very similar between all compared populations with the exception of F IS in the African population which is higher than in the other populations (Table 2a).There was also a higher number of private alleles in the African wintering ground (Table 2).Namibia was not assessed for diversity or differentiation statistics separately due to the small sample size (n = 2).Inspection of the traces for all twelve runs of BAYESASS indicated that convergence was achieved with effective sample sizes for all parameters on each run > 10,000.The migration rate estimates were consistent across runs.Genetic estimates of dispersal rates using BAYESASS comparing either two regions (South Atlantic vs. Indo-Pacific) or three regions (Africa, South America and Indo-Pacific) revealed a higher estimated gene flow from the South Atlantic to Indo-Pacific (0.3035, 95% HPDs: 0.2814, 0.3316) than in the opposite direction (0.0031, 95% HPDs: < 0.001, < 0.001) (Fig. 1; Supplementary Material: Table S8).
The topology of the haplotype network showed two haplogroups with a certain degree of geographic structure (Supplementary Material: Fig. S5).The two Namibian haplotypes are grouped with the South African haplotypes.P-distance within the species ranged from 0.003 to 0.047.Of the 61 total haplotypes, 24 were private to the African population, 14 were private for the Southern American population and three for the Indo-Pacific.As previously described, the South African population has the highest number of haplotypes and haplotype diversity and the Indo-Pacific the lowest.There were four haplotypes shared between animals sampled in Africa, South America and the Indo-Pacific.There were two haplotypes (ValHapD; ValHapP), previously found only in either the South American or Indo-Pacific populations that were now found in African populations.There is a non-significant difference in Fu´s FS statistics between populations, it is negative in the African population, above zero in the South American population and high in the Indo-Pacific population (Table 2b).The Bayesian Skyline Plot analyses revealed a decline of effective population size in examined populations at the end of Pleistocene/Holocene (Fig. 4, Supplementary Material: Fig. S6).

Utilization of a noninvasive approach
Noninvasive genetic research has been conducted on several marine mammal species (Caballero et al., 2001;Gendron and Mesnick, 2001;Drouot et al., 2004), however, it still carries many difficulties compared to terrestrial and freshwater habitats (Goossens and Bruford, 2009).Here, we present an extensive noninvasive study performed on a SRW population, proving the utility of this approach in conservation genetic studies within a citizen science framework involving whale watching crews.We show that data collected in this way can be used to generate a DNA profile catalogue that can be used to understand identity, philopatry and genetic diversity of whale migratory habitats.
Collecting noninvasive samples from whale watching platforms seems to be efficient with a total of 264 sloughed skin samples collected during this study.However, subsequent genotyping identified several same-day recaptures (n = 21), particularly on days when there were several whale watching trips.DNA extracted from noninvasive samples had relatively high genotyping success (67.8% in microsatellite genotyping, of which 98.5% sex determination of already genotyped samples and 94.6% in mitochondrial marker).Our results are high compared to other types of noninvasive samples, for example scats (typical genotyping success around 50%; Beja-Pereira et al., 2009;Carroll et al., 2018;Waits and Paetkau, 2005), although methods vary substantially among studies, complicating direct comparison.Sampling success in particular years correlates with the substantial interannual variation in numbers of whale sighted in the South African breeding grounds between 2016 and 2018 (Vermeulen et al., 2018;Vermeulen et al., 2020).Here we report data from years with the most pronounced interannual abundance fluctuations between 2016 and 2018.This is an important aspect in planning logistics of noninvasive sampling.The abovementioned trend is correlated with a shift in a calving interval, from three-year to four-or five-year calving interval (Vermeulen et al., 2020) and interpreted as resulting from calving failure (Knowlton et al., 1994).The main disadvantage of sloughed skin sampling is the lack of ability to link genotypes to other data types, for example, long-term reproductive and sighting records for South African SRWs (Vermeulen et al., 2018;Brandão et al., 2018).As photo-IDs or even temporary field IDs of individuals reduces the likelihood of resampling, the decoupling of sample from ID likely also contributed to higher within-year resampling and therefore an increase to downstream analysis costs.
Furthermore, sloughed skin has been shed by the whale, and so will have a different epigenetic profile than biopsy samples taken from living whales, meaning epigenetic assays such as molecular age biomarkers (e.g.Polanowski et al., 2014) cannot be used with such samples.Biopsy samples also often come with a small amount of blubber that can be used in hormone assays (Hunt et al., 2013;Carone et al., 2019).However, stable isotope analyses can be successfully conducted on sloughed skin samples (Witteveen et al., 2009;Busquets-Vass et al., 2017;Marcoux et al., 2007).
Noninvasive sampling of sloughed skin has advantages in that it does not require specialised training, equipment or permits that are necessary for remote biopsy sampling (Best et al., 2005;Whitehead et al., 1990;Brown et al., 1991), and offers the opportunity for samples to be collected by for example multiple whale watching operators along the coast simultaneously.Regarding observer effect, SRW behaviour can be impacted by vessel noise and close boat approaches.For example, controlled vessel approach experiments have documented that cows with calves are the most sensitive demographic classes to disturbance, starting to react from more than 1 km away (Lundquist, 2003;Barrett, 2000).Therefore, the use of whale watching boats depends on local regulations (Argüelles et al., 2016), but of course is preferable to whaling for the sustainable existence of marine megafaunal populations at circumstances of the market-driven world of the 21st century, i.e. from the point of view of exploration-exploitation dilemma.The present approach utilizes these platforms that already engage with the whales rather than adding additional boat operations.

Population structure and dynamics of southern right whales in Africa
African population contains a large proportion of the species genetic variance, as shown by mtDNA diversity (cf.Carroll et al., 2020) and de novo DAPC cluster analysis.Variation of mitochondrial and nuclear markers show affinities of whales from Namibia to the South African wintering ground, which supports the hypothesis that Namibian whales represent range expansion from South Africa, suggested by the photo ID matches with South African whales (Roux et al., 2015).In contrast, DAPC analysis of the African dataset with prior settings suggests genetic differentiation of Namibian whales from South African (Fig. 2b).This could support the hypotheses that SRWs in Namibia are remnants of an originally separate stock (Roux et al., 2015).These hypotheses should be investigated by additional sampling and further research.
This study showed a relatively high male-to-female sex ratio (1.13) compared to other datasets from the 2010s (0.83 in Australia, Carroll et al., 2015), 2000s (0.91 in New Zealand, Carroll et al., 2013) and 1990s (1.07 in New Zealand, Carroll et al., 2013).The male-biased sex ratio could reduce the effective population size.However, a high proportion of sampling near surface-active groups dominated by males (Best et al., 2003) could bias the calculated sex ratio in certain studies.Females involved in these mating groups are frequently immature and have not been seen with a calf in the following year (Best et al., 2003), so for that reason it is believed that SRWs mate in the feeding grounds or during migration as well (Carroll et al., 2015).Therefore, sex ratio may differ in particular demes.
Genetic clustering and weak and nominally significant FST within South Africa suggested that there is minimal difference between samples collected in the 1990s and 2010s

Transoceanic differentiation and gene flow
Pleistocene diversification of the species, possibly related to isolation of oceanic basins during glacial periods, is indicated by presence of two main mitochondrial haplogroups and nuclear genetic distances (Patenaude et al., 2007;Carroll et al., 2019).Two main nuclear genetic clusters (the Indo-Pacific and South Atlantic) support this hypothesis.It is likely that philopatry to wintering grounds has contributed to limited gene flow between South American/African and Indo-Pacific populations.However, these results should be interpreted with caution given the samples from different wintering grounds come from different time periods and given limits of de novo group designation in DAPC (Miller et al., 2020).DAPC with prior settings by population indicate differentiation of South Atlantic and Indo-Pacific samples (Carroll et al., 2019).
Shared feeding grounds and long-distance movements provide a possible mechanism for gene flow between South Atlantic populations (Carroll et al., 2019(Carroll et al., , 2020)).The migration rate from the Indo-Pacific to the South Atlantic estimated in this study using BAYESASS differs from the results in Carroll et al. (2019), with reported gene flow of 0.028 (95% HPDs: < 0.001, 0.068) from the Indo-Pacific to the South Atlantic and 0.038 (95% HPDs: 0.006, 0.083) in the opposite direction.Our study found gene flow from the Indo-Pacific to the South Atlantic of 0.0031 (95% HPDs: < 0.001, < 0.001) and much higher in the opposite direction: 0.3035 (95% HPDs: 0.2814, 0.3316).When comparing three populations (Africa, South America and Indo-Pacific), gene flow from Africa to Indo-Pacific is 0.3035 -a threefold higher migration rate than described in Carroll et al. (2019).The present results, indicating higher recent migration from Africa compared to historical data, could be related to population recovery of the South African population (Carroll et al., 2019), and/or a shift in foraging strategy (e.g., van den Berg et al., 2021) increasing connectivity between the South African and the Indo-Pacific during the same period.As with the previous estimates of connectivity, BayesAss results should be interpreted with caution considering potential deviations from method assumptions including constant population size and migration rates through time.

Demography
Contraction of SRW population can be assumed considering the excess of high-frequency polymorphism, as formalised by the positive values of Tajima´s D. The relatively low values for the South African population indicate the importance of this area for the species recovery.Both Tajima´s D and Fu´s Fs were not significant, likely due to the small number of markers.The shape of the BSP and values of neutrality tests probably indicates higher effective population size during Pleistocene and population decline toward Holocene, the scenario proposed for several baleen whales such as Antarctic blue whales (Balaenoptera musculus intermedia) (Attard et al., 2015) and bowhead whales (Balaena mysticetus) (Phillips et al., 2013).In SRW, Carroll et al. (2019) suggests that the sea-level rise following the LGM prompted increased dispersal, so it may be that changes in the BSP reflect fluctuations in effective population size related to connectivity rather than changes in census abundance.Although the level of gene flow, presence of population substructure and sampling strategy could affect the BSP method, the observed signal is comparable to Holocene decline scenario plots in simulation studies under various models (Heller et al., 2013).
Present data indicate an increase in inbreeding rate, illustrated by almost double the increase of the point estimate inbreeding coefficient (F IS ) of 2010s South African population in comparison with 1990s.The values of F IS based on microsatellite loci are relatively high, with the 2010 South African value comparable with other cetacean species with uncertain conservation status such as blue whale (Balaenoptera musculus) with F IS of 0.062 (Torres-Florez et al., 2014).However, the confidence intervals for these estimates overlap, and there is an increased rate of allelic dropout in the sloughed skin samples, which form the majority of the 2010s samples.Inbreeding can lead into reductions in heterozygosity, reduced fitness and to increased risk of extinction (Frankham et al., 2004).On the other hand, impact of higher levels of inbreeding among North Atlantic right whales (E.glacialis) was not detected (Radvan, 2019), potentially due to post copulatory selection towards increasing heterozygosity (Frasier et al., 2013) or due to marker limitations.Therefore, future studies should use genomic tools to assess for example runs of homozygosity or changes in functional genes (Leroy et al., 2018).
The changes in zooplankton availability related to the Anthropocene are often mirrored by changes in movement ecology and reproductive biology in filter feeders (van den Berg et al., 2021).For example, it is possible that SRW are more widely dispersed searching for food, potentially year-round in South Africa given the lack of unaccompanied adults returning to this wintering ground (Best, 2000).Given that there is likely a maternally learned component of foraging ground behaviour (Valenzuela et al., 2009), with more genetically similar whales more likely to foraging in the same region (Carroll et al., 2015), such prolonged foraging behaviour could lead to inbreeding if it extended into the winter breeding season.

Conclusions
Marine megafaunal species represent an iconic object for conservation biology.Populations of great whales were historically affected by industrial overexploitation, while substantial newer threats are connected with ship strikes, other anthropogenic stressors and environmental issues.Both periods heavily affected slowly swimming planktivores that were easily accessible to whalers in both their breeding grounds and the Southern Ocean, which is undergoing rapid change due to cumulative anthropogenic impacts (Agrelo et al., 2021).Therefore, cetology should be a pioneer of low-impact methods.The current study presents the applicability of noninvasive genetics realised via citizen science in studying large marine mammals and their potential recovery, which is crucial considering their keystone ecological status and ecosystem services (Roman et al., 2014).We have demonstrated the power of such an approach during monitoring changes in the African population of southern right whale, including inbreeding, spatial behaviour and gene flow, presumably linked to habitat and climate changes in the Southern Ocean.

Fig. 1 .
Fig. 1.Map showing the number of southern right whale (Eubalaena australis) unique genotypes used in the present study, location of southern right whale wintering breeding habitats (black hatch pattern) during austral winter months(International Whaling Commission, 2020; Best, 2007;Zerbini et al., 2018) estimates of gene flow between marked breeding grounds from BayesAss (grey arrows, length of the arrows represents approximate values and numbers exact values).Putative summer feeding or foraging grounds during austral summer months are displayed by red hatch pattern(Best, 2007;Zerbini et al., 2018;Weir and Stanworth, 2020;Smith et al., 2012;Mackay et al., 2020;Carroll et al., 2020;Baines and Weir, 2020; van den Berg et al., 2021).Heatmap shows the summer near-surface chlorophyll a concentration (according toDeppeler and Davidson, 2017), a useful proxy of biomass for primary production.The map on the right is showing sampling sites in southern Africa and the number of collected samples in particular locations.

Fig. 2 .
Fig. 2. Clustering of Eubalaena australis nuclear microsatellite genotypes based on 14 loci (a) for the 2010s African population (189 individuals; SA = South Africa, Nam = Namibia).Each genotype is denoted as a vertical bar divided into K colour segments.Proportions of colour segments are equivalent to Structure q values; (b) for the 2010s African population based on 189 individuals.SA = South Africa; Nam = Namibia.

Fig. 3 .
Fig. 3. Clustering of Eubalaena australis nuclear microsatellite genotypes based on 14 loci (a) Bayesian clustering in Structure for global dataset (410 individuals) and (b) DAPC of the global dataset based on 410 samples.

Fig. 4 .
Fig. 4. Bayesian Skyline Plots of mtDNA (left hypervariable domain of control region) for Eubalaena australis for African population based on 597 sequences.HKY + I + G model of sequence evolution was used; ESS = 390.The graph displays changes in effective population size (N e µ) over time (Mya).The dark blue line depicts the median estimate, and the margins of the blue area represent the highest 95% posterior density intervals.

P
.Neveceralova et al.

Table 1
Genotyping success of different type of samples of Eubalaena australis.

Table 2
Polymorphism and demographic characteristics of Eubalaena australis a) based on 14 microsatellite loci and b) based on 1602 mtDNA sequences (left hypervariable domain of control region).