Combining Bayesian age models and genetics to investigate population dynamics and extinction of the last mammoths in northern Siberia

be challenging when working with poorly resolved paleontological data sets. One approach to increase the data resolution is by combining different methods. In this study, we used both radiocarbon and genetic data to reconstruct the population history and extinction dynamics of the woolly mammoth in northern Siberia. We generated 88 new radiocarbon dates and combined these with previously published dates from 626 specimens to construct Bayesian age models. These models show that mammoths disappeared on the eastern Siberian mainland before the onset of the Younger Dryas (12.9 e 11.7 ky cal BP). Mammoths did however persist in the northernmost parts of central and western Siberia until the early Holocene. Further genetic results of 131 high quality mitogenomes, including 22 new mitogenomes generated in this study, support the hypothesis that mammoths from, or closely related to, a central and/or west-Siberian population recolonized Wrangel Island over the now sub-merged northern Siberian plains. As mammoths became trapped on the island due to rising sea levels, they lived another ca. 6000 years on Wrangel Island before eventually going extinct ca. 4000 years ago. © 2021 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
The Late Pleistocene was characterized by the rapid and worldwide decline of megafauna.The primary cause of these extinctions remains uncertain, and the debate generally revolves around the relative impact of climate and humans.Over recent decades, many methods have been used to study these extinction events, including radiocarbon data, ancient DNA, stable isotope analysis, and microscopy (see Swift et al., 2019 for a review).Commonly used are radiocarbon chronologies, which can be used to reconstruct the presence and absence of a group of organisms in time, and ancient DNA, which can be used to study population movements, demography and species composition through time.However, these methods are often compromised by low-resolution and/or missing data, and only solve part of the puzzle.By combining different methods we can gain a more fine-grained understanding of the processes and dynamics leading up to the extinction of a species (Swift et al., 2019).Although the field is evolving towards a more integrative approach, studies combining different data types remain relatively rare to date (but see e.g.Immel et al., 2015;Kosintsev et al., 2019;Metcalf et al., 2016;Terlato et al., 2019).
One of the best-studied extinct megafaunal species is the woolly mammoth (Mammuthus primigenius), for which a large number of radiocarbon dated and georeferenced specimens are available.During the Last Glacial Maximum (LGM, 28.6e22.5 thousand calendar years before present [ky cal BP]) its distribution stretched from Eurasia to North America (Lister and Sher, 2015), but between then and the Late Glacial (beginning 14.7 ky cal BP) it went through a series of successive local extinction and partial recolonization events (Kuzmin, 2010;Nikolskiy et al., 2011;MacDonald et al., 2012;Markova et al., 2013;Puzachenko et al., 2017;Stuart et al., 2004).By the Younger Dryas (12.9e11.7 ky cal BP) mammoths were confined to refugia in the northernmost parts of Siberia (Lister and Stuart, 2008;Nikolskiy et al., 2011) and northern Europe (Stuart et al., 2002), and by ca.9.5 ky cal BP in the early Holocene, woolly mammoths were extinct across mainland Eurasia and North America.It did however survive into the Holocene isolated on two islands in the Bering and Chukchi sea e St. Paul Island and Wrangel Island, respectively (Guthrie, 2004;Vartanyan et al, 1993Vartanyan et al, , 1995)).The youngest known continental mammoth samples originate from New Siberian Islands (then part of the mainland) and Taimyr Peninsula (Kuzmin et al., 2003;Nikolskiy et al., 2011) and it has been suggested that mammoths from these regions recolonized Wrangel Island (Pe cnerov a et al., 2017).
Wrangel Island lost its connection to the mainland ca.10.5e10 ky cal BP (Arppe et al., 2009;Vartanyan et al., 2008) following global rising sea levels at the end of the Pleistocene.Mammoth remains on the island have been dated to the period between at least 40 ky cal BP until ca.14 ky cal BP, around the same time as the latest remains have been found on the adjacent Eastern Siberian mainland (Vartanyan, 2013;Vartanyan et al., 2008).The radiocarbon record on Wrangel Island starts again ca. 10 ky cal BP and is more or less continuous until ca. 4 ky cal BP (Vartanyan et al., 2008).Whether the observed gap in the fossil record through the Late Glacial and earliest Holocene is due to a local extinction/ recolonization event, sampling bias, or a population bottleneck remains debated (Guthrie, 2004;Nystr€ om et al., 2010;Vartanyan et al., 2008).A previous study on complete woolly mammoth mitogenomes did however find that the Holocene Wrangel Island population was likely founded by a single mitochondrial haplotype (Pe cnerov a et al., 2017).
To understand the cause and dynamics of the extinction of Siberian woolly mammoths, detailed information on the timing of its presence and absence in different geographic regions is of key importance.Yet, current extinction time estimates are based on the age of the latest individual sample, which is only a rough estimation for the presence or absence of a population (Bradshaw et al., 2012;Solow et al., 2006).The last known sample will always pre-date the real extinction event, but how close in time these are depends on the sampling density.One way to address this issue is by using Bayesian age models.These models can be used to statistically determine the start and end boundary of the presence of a group of organisms using their radiocarbon record as prior information, while accounting for sampling bias and dating error (e.g.Higham et al., 2014;Kosintsev et al., 2019;Metcalf et al., 2016).
The aim of this study was to investigate the extinction dynamics of the Siberian woolly mammoth on both the mainland and Wrangel Island by combining radiocarbon and genetic data.To do this we used evidence from Bayesian age models to estimate the disappearance dates of woolly mammoths in different geographic regions, and Bayesian phylogenetic trees, which enable inference of the genetic affinities between mammoths on Wrangel Island and different mainland regions.

Radiocarbon dating
We generated new radiocarbon age estimates for 88 woolly mammoth (Mammuthus primigenius) samples that were collected in northern Siberia, including Wrangel Island, during a series of expeditions in the last 20 years (Supplementary material 1).Dating was done in the following laboratories: Beta Analytic ( 14 C lab identification: Beta), the Oxford Radiocarbon Accelerator Unit (ORAU) ( 14 C lab identification: OxA), NEISRI FEB RAS, Magadan ( 14 C lab identification: MAG), St. Petersburg State University ( 14 C lab identification: LU), Uppsala University ( 14 C lab identification: Ua), University of Arizona ( 14 C lab identification: AA), Center for Accelerator Mass Spectrometry (CAMS) ( 14 C lab identification: CAMS).Samples dated at ORAU and Beta Analytic were pretreated to remove contaminants using ultrafiltration (Ramsey et al., 2004).In addition, we also collated previously published and validated radiocarbon dates from 626 specimens (see section 2.5 for more details).

Samples and laboratory analyses for DNA analysis
We analyzed new genomic data from 22 woolly mammoth specimens that had been collected in eastern Siberia and Wrangel Island (Table 1).To avoid contamination, we carried out all DNA extractions and subsequent library preparations in a dedicated ancient lab facility at the Swedish Museum of Natural History.Approximately 50 mg of bone, tusk or tooth powder was collected using a Dremel drill.DNA was extracted as described in Ersmark et al., (2015).We then prepared double stranded Illumina libraries following the protocol of Meyer and Kircher (2010) but replacing the SPRI Bead reaction clean-up steps with Qiagen MinElute columns and halving the reaction volumes.In addition, we carried out uracil-treatment with USER enzyme as described in Pe cnerov a et al. ( 2017) to remove damaged bases caused by cytosine deamination at overhangs.Indexing amplifications were performed in 25mL reaction volumes containing 1X AccuPrime reaction mix (Life Technologies), 0.3 mM of each indexing primer, 1.25 U AccuPrime Pfx DNA polymerase (Life Technologies) and 3 ml of template DNA with the following reaction conditions: 95 C for 2 min, followed by 10e16 cycles of 95 C for 15 s, 60 C for 30 s and 68 C for 30 s. Finally, we size selected the amplified libraries using Agencourt AMPure XP beads (Beckman Coulter) and quantified their concentration using a high sensitivity chip on the BioAnalyzer 2100 (Agilent) before pooling them in equimolar ratios.Samples labeled "MD" and "HM" (Table 1) were sequenced with Illumina NovaSeq technology (S4, 2x150bp), whereas all the other samples were sequenced with Illumina HiSeq2500 technology (High-output mode, 2x125bp).

Sequencing data processing
Paired-end reads were processed using an in-house developed pipeline written in Snakemake version 4.5.0 (K€ oster and Rahmann, 2012).In short, adapter trimming and read merging was done with default parameters using SeqPrep v1.2 (https://github.com/jstjohn/SeqPrep), with a minor modification to the source code to ensure the best quality score of the bases in the merged region, instead of the combined score (Palkopoulou et al., 2015).Following Pe cnerov a et al. ( 2017), a concatenated nuclear-mitochondrial reference assembly consisting of the African savanna elephant nuclear genome (LoxAfr4, Broad Institute) and the woolly mammoth mitochondrial genome (Genbank accession no.DQ188829 (Krause et al., 2006)) was created to avoid mapping of nuclear DNA of mitochondrial origin (NUMTs) to the mitochondrial reference genome.Merged reads were mapped against the reference using BWA aln v0.7.17 (Li and Durbin, 2010) with ancient DNA specific parameters that disable seeding (-l 16500), allow more substitutions (-n 0.01) and allow for up to two gaps (-o 2).Bam files were generated using BWA samse.PCR duplicates were removed using a modified script from Palkopoulou et al. (2015), which removes duplicates based on both the 5 0 and 3' end of the read.Indels were realigned using GATK v3.8e0 (McKenna et al., 2010).Finally, reads that mapped to the reference mitogenome with a filtering quality of at least 20 were extracted using SAMtools v1.6 (Li et al., 2009) (Supplementary material 2).
Processed BAM files were imported into Geneious (Kearse et al., 2012).Consensus sequences were called for positions with at least 3X coverage using the majority rule, with ambiguous or lowcoverage positions called as undetermined.Mitogenomes for which at least 80% of the positions were resolved were kept and aligned to 109 previously published clade I finitely-dated mitogenomes (Fig. 1, Supplementary material 3) and the Asian elephant reference (Genbank accession no.EF588275) with MAFFT v7.450 (Katoh and Standley, 2013) using the G-INS-I algorithm as implemented in Geneious.The variable number tandem repeats (VNTR) region (position 16164e16637) was removed from the alignment, since it had not been assembled in all previously published genomes (Gilbert et al, 2007(Gilbert et al, , 2008)).

Phylogenetic analysis
We conducted a Bayesian phylogenetic analysis using BEAST v1.8.4 (Drummond et al., 2012).The median calibrated radiocarbon age was used to tip-calibrate the phylogenetic tree.Samples were age calibrated in OxCal v4.3 (Ramsey, 2009a) using the IntCal 13 Northern Hemisphere radiocarbon calibration curve (Reimer et al., 2013).If the samples had been independently dated more than once, the R_Combine function was used in OxCal to obtain a median date.The HKY þ G substitution model was chosen, as determined by the Bayesian Information Criterion in jModeltest2.1.10(Darriba et al., 2012;Guindon and Gascuel, 2003).A strict molecular clock was applied and the mutation rate was estimated from the data with a uniform distribution, starting with a rate of 8.07 Â 10 À8 site year À1 (Palkopoulou et al., 2013).A constant population size model was applied using default settings.The Markov Chain Monte Carlo was run for 200 million generations and run diagnostics were sampled every 5000th generation.We checked run convergence in Tracer v1.7.1 (Rambaut et al., 2018), by making sure all run parameters had an ESS > 200.Information of the analysis was summarized using TreeAnnotator (http://beast.community/treeannotator) and 25% of the states were discarded as burn-in.Finally, the tree was visualized and polished in FigTree v1.4.4 (http://tree.bio.ed.ac.uk/software/figtree).

Bayesian age models
We focused on several different regions of northern Siberia to build the Bayesian age models.To ensure that only reliable dates were used, all available radiocarbon dates were audited as described in Lister and Stuart (2013).Following this approach, finite dates of 720 north Siberian woolly mammoth specimens, east of 70 E and north of 60 N, were accepted including dates from both published literature and new ones reported here (Supplementary material 1).Samples were subdivided into four groups to reflect their geographic region, which for the purpose of simplicity are referred to as: eastern region (n ¼ 102), central region (n ¼ 262), western region (n ¼ 193), and Wrangel Island (n ¼ 163) (Table 1, Fig. 3).We used the Lena and Kolyma rivers to define the three mainland regions.Since Wrangel Island and the New Siberian Islands were connected to the Siberian mainland until ~10.5 kya, dates from these islands older than this date were categorized within the eastern and central group, respectively.Samples with dates <10.5 kya were categorized in the (Holocene) Wrangel Island group.
For each geographic group, we built a single phase model with a general outlier analysis (prior probability ¼ 0.05) (Ramsey, 2009b) in OxCal 4.3 (Ramsey, 2009a) using the IntCal 13 atmospheric curve (Reimer et al., 2013).Samples that were dated more than once were combined using the R_Combine function with a simple outlier analysis (prior probability ¼ 0.05) (Ramsey, 2009b).The general outlier model covers the uncertainty that a date may not be associated with the modelled event (i.e. the phase model in this case) whereas the simple outlier model deals with the uncertainty of the radiocarbon date itself being wrong.In both cases, the contribution of the date to the final age estimate is down weighed if dates are flagged as outliers.The Boundary function was used to calculate posterior density functions (PDFs) of the start and end date of each phase.The full Bayesian model is given in Supplementary material 4. All results are reported with a 95.4% confidence range in thousand calendar years before present (cal BP).

Phylogenetic analysis
Using the constant population size model, we obtained a mean mutation rate of the mitogenomes of 1.57 Â 10 À8 site year À1 (95% HPD interval [1.28e1.89Â 10 À8 site year À1 ]).Consistent with previous studies (Chang et al., 2017;Enk et al., 2016), there was a wellsupported (posterior probability > 0.95) genetic distinction between clade I mammoths from North America and Eurasia, with the exception of three North American samples, all of them >20 ky cal BP, that grouped within the Eurasian clade (Fig. 2).The Holocene Wrangel Island mammoths formed a monophyletic clade within the Eurasian clade (posterior probability >0.95).Mitogenome relationships within the Holocene Wrangel Island population were poorly resolved, indicating rapid diversification.The mainland individuals most closely related to the Holocene Wrangel clade were those from central (New Siberian Islands) and the western region (Taimyr Peninsula), as well as a 15 ky cal BP sample from Europe.

Bayesian age model
In our Bayesian age model, the start boundary represents the limit of radiocarbon dating for the eastern, central and western regions, whereas the end boundary is an estimate of the last appearance date e and thus local extinction e of woolly mammoths in that region (Table 2).For Wrangel Island after the lateglacial/early Holocene break, the start boundary represents the first appearance of woolly mammoths on Wrangel Island, while the end boundary marks the global extinction of the species.
The model revealed a time gap of almost 3000 years between the disappearance of mammoths in the eastern region (14.2e12.9ky cal BP) and their appearance (10.2e9.8 ky cal BP) on Wrangel Island.Mammoths persisted longer in the central region (end boundary 11.0e10.2ky cal BP) and western region (11.2e10.4ky cal BP) before also disappearing there.Finally, our analysis suggests 4.2e3.9ky cal BP as the timing for the final extinction of the woolly mammoth on Wrangel Island (Table 2; Fig. 4).

Timing of regional extinctions
Consistent with previous studies (Lister and Stuart, 2008;Markova et al., 2013;Puzachenko et al., 2017), our Bayesian age models show that the extinction of the woolly mammoth did not occur as a single event, but was characterized by mammoths disappearing in different geographical regions at different times.The ultimate cause of mammoth extinction is still under debate (Lorenzen et al., 2011;Nikolskiy et al., 2011).However, vegetation changes driven by climate change likely played a key role in the mammoth's range contraction after the Last Glacial Maximum (LGM, 28.6e22.5 ky cal BP) (Lister and Stuart, 2008;Kuzmin 2010).At the end of the Late Pleistocene, beginning ca. 15 kya, the climate started to change; wetter and warmer weather conditions led to the gradual replacement of dry and productive "mammoth steppe" (Guthrie, 1990) in Siberia with wet tundra and temperate and boreal forests (Allen et al., 2010;Binney et al., 2017;Huntley et al., 2013;Sher, 1997).These unfavorable conditions likely forced mammoths to retreat to the treeless north ( Ri c ankov a et al., 2015).Interestingly, our results show that the disappearance of woolly mammoths in the eastern region between 14.2 and 12.9 ky cal BP does not coincide with the onset of warmer conditions of the  BøllingeAllerød period (14.7e12.9ky cal BP).This delayed extinction response might be explained by the lagged and gradual expansion of the tree line in northeastern Siberia (Allen et al., 2010;Huntley et al., 2013), but in Chukotka it still precedes the main expansion of the range of trees at 11.5e8.8ka cal BP (Vartanyan et al. in prep.).The mammoth steppe persisted even longer in the northernmost parts of Siberia (Binney et al., 2017;Sher, 1997), allowing mammoths to survive there in small refugia (Lister and Stuart, 2008) before eventually going extinct on the mainland in the early Holocene.
Current estimates for the separation of Wrangel Island from the eastern Siberian mainland, based on bathymetric data, range between 10.5 and 10.0 ky cal BP (Vartanyan et al., 2008).If the observed appearance date in our Bayesian age model (10.2e9.8 ky cal BP) represents the Holocene colonization of Wrangel, the island cannot have been separated by rising sea levels earlier than 10.2 ky cal BP.
Previous studies have assumed that the extinction of the woolly mammoth on Wrangel Island took place some time after the date of the youngest sample, which has been dated to 4024 yr cal BP (Ua-13366, 3685 ± 60 yr BP) (Vartanyan et al., 2008).However, it has been unclear how long after this the extinction occurred.Our Bayesian model suggests that mammoths disappeared no later than 100 years after that date (Table 2), indicating that the youngest known sample does indeed represent one of the last surviving mammoths.The cause of the mammoth's extinction on Wrangel Island remains uncertain.Environmental stochasticity (Arppe et al., 2019), genetic processes due to small population size (Fry et al., 2020;Palkopoulou et al., 2015;Pe cnerov a et al., 2017;Rogers and Slatkin, 2017) and human hunting have been proposed as possible causes.Based on our Bayesian age model results, the latter hypothesis seems less likely since the first and thus far only known prehistoric occurrence of humans on Wrangel Island has been dated to 3583 y cal BP (Ua-18086, 3345 ± 70 yr BP) on associated wood (Gerasimov et al., 2006).

Geographical patterns and the origin of the Holocene Wrangel Island population
Our Bayesian age models suggest that mammoths went extinct in the eastern region almost three millennia before reappearing on Wrangel Island during the early Holocene.However, as the model assumes a uniform occupation structure, it cannot fully exclude the possibility that mammoths were present as a continuous population but at low density during the observed gap.On the other hand, our model does suggest that mammoths survived in the northernmost parts of the central and western region during the Younger Dryas (12.9e11.7 ky cal BP) and the early Holocene (11.7e10.2ky cal BP).
To further investigate the origin of the Holocene Wrangel Island mammoth population, we analyzed mitochondrial genomes of clade I mammoths across both Eurasia and North America.Our phylogenetic analysis shows that the Holocene Wrangel Island mammoth population was most closely related to mammoths from the western and central regions, and not to mammoths from the geographically closer eastern region or North America.This indicates a difference in genetic composition between the Pleistocene and Holocene Wrangel populations, and supports the hypothesis that the chronological age gap in the eastern region (including Wrangel) from ca. 13e10 kya ago is effectively caused by the absence of mammoths in that region.Furthermore, in agreement with our Bayesian age models, the close genetic relationship of the Holocene Wrangel island population with populations to the west indicates that the source population of Holocene Wrangel Island either came from the central and/or western region or was closely related to that population (Pe cnerov a et al., 2017).As such, we suggest a model where mammoths recolonized Wrangel Island by migrating over regions now submerged by the Laptev and East Siberian Seas.
The Siberian coastline extended further northwards during the Late Pleistocene and mammoths likely inhabited these now submerged northern Siberian plains (on what is today Severnaya Zemlya).Whereas mammoths were widespread during the LGM (Fig. 5 -A), they retreated to the north during the Late Glacial.From 12.9 ky cal BP, our results suggest that mammoths were confined to regions north of 72 N, including the central region (on what is today the New Siberian Islands), western region (on the Taimyr Peninsula), but also the northern-Siberian plains (Fig. 5 -B).However, Wrangel Island lies below 72 N, which conforms to the mammoth's disappearance on Wrangel during this time.Intriguingly, mammoths expanded southwards and recolonized Wrangel Island e which was then still part of the Beringian mainland e during the early Holocene.This expansion of range during a time of warm climate demonstrates that the mammoth's response to climate change was complex.Shortly after the warming events of the Holocene, however, sea levels started rising and mammoths rapidly became trapped on Wrangel Island (Fig. 5 e C).From our data we cannot determine whether mammoths recolonized Wrangel Island from the central or western regions by migrating over the northern Siberian plains, or from an undocumented resident population in the Northeastern Siberian plains.

Conclusions
On a continental scale it is clear that the Late-glacial to Holocene expansion of forest, driven by climate warming, led to a massive reduction of open habitats across northern Eurasia, and to thereby to a reduction in range of the mammoth and other 'steppe-tundra' inhabitants (Guthrie 1990;Sher 1997).However, on a fine scale, this process comprised a complex spatiotemporal pattern of changes in climate and vegetation, survival of mammal species in refugia, local extirpations, and temporary range expansions (Lister and Stuart 2008).We have demonstrated how modelling a combination of genetic, dating and biogeographic data can bring focus to the details of this process and hence to the pattern and potentially causes of final extinction.The additional role of humans in this process cannot be discounted, but for northeast Siberia at least, archeological data are too limited to test this possibility.
In principle, our approach of combining Bayesian age models and genetic data can be applied to identifying population replacements and extirpations in other mammalian species.This is currently limited in part by the availability of data, which for the mammoth is far more extensive than for any other species.The woolly rhinoceros also became extinct, but did so more synchronously across its range, and before the Holocene.Other species' ranges also broadly retreated to the north during this interval, including bison, horse and musk oxen, but these did not become globally extinct.With further data, the approach highlighted here provides a potential key to understanding these different patterns and potentially to better predicting the response of threatened northern species to climate change today.

Fig. 1 .
Fig. 1.Map showing sampling locations of Mammuthus sp.samples used in the mitogenomes phylogenetic tree reconstruction.Blue dots ¼ western region, yellow diamonds ¼ central region, red triangles¼ eastern region and Wrangel Island, green squares ¼ North America, purple crosses ¼ Europe.(For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)

Fig. 3 .
Fig. 3. Map showing sampling locations of North Siberian Mammuthus primigenius samples used to construct age-models.Blue dots¼ western region, yellow diamonds¼ central region, red triangles¼ eastern region and Wrangel Island.(For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)

Fig. 5 .
Fig. 5. Shifting woolly mammoth distribution in Siberia.Present-day shorelines are shown in dark grey.Coastlines as they were during the Late Pleistocene and middle Holocene are shown in light grey.The 72nd parallel north is shown as a dotted line.Blue dots represent sample locations used in the Bayesian age models (A) Sample locations of mammoth remains older than 12.9 ky cal BP. (B) Sample locations of mammoth remains between 12.9 and 10.2 ky cal BP. (C) Sample locations of mammoth remains younger than 10.2 ky cal BP. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)

Table 1
Geographic regions used as phase for Bayesian age modelling.

Table 2
Results of the Bayesian age model.Range is given in years cal BP (95.4% probability).