Multiproxy analysis of permafrost preserved faeces provides an unprecedented insight into the diets and habitats of extinct and extant megafauna

The study of faecal samples to reconstruct the diets and habitats of extinct megafauna has traditionally relied on pollen and macrofossil analysis. DNA metabarcoding has emerged as a valuable tool to complement and refine these proxies. While published studies have compared the results of these three proxies for sediments, this comparison is currently lacking for permafrost preserved mammal faeces. Moreover, most metabarcoding studies have focused on a single plant-specific DNAmarker region. In this study, we target both the commonly used chloroplast trnL P6 loop as well as nuclear ribosomal ITS (nrITS). The latter can increase taxonomic resolution of plant identifications but requires DNA to be relatively well preserved because of the target length (~300e500 bp). We compare DNA results to pollen and macrofossil analyses from permafrost and ice-preserved faeces of Pleistocene and Holocene megafauna. Samples include woolly mammoth, horse, steppe bison as well as Holocene and extant caribou. Most plant identifications were found using DNA, likely because the studied faeces contained many vegetative remains that could not be identified using macrofossils or pollen. Several taxa were, however, identified to lower taxonomic levels uniquely with macrofossil and pollen analysis. The nrITS marker provides species level taxonomic resolution for commonly encountered plant families that are hard to distinguish using the other proxies (e.g. Asteraceae, Cyperaceae and Poaceae). Integrating the results from all proxies, we are able to accurately reconstruct known diets and habitats of the extant caribou. Applying this approach to the extinct mammals, we find that the Holocene horse and steppe bison were not strict grazers but mixed feeders living in a marshy wetland environment. The mammoths showed highly varying diets from different non-analogous habitats. This confirms the presence of a mosaic of habitats in the Pleistocene ‘mammoth steppe’ that mammoths could fully exploit due to their flexibility in food choice. © 2021 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). er, Leiden, the Netherlands.


Introduction
During much of the Late Pleistocene epoch, Siberia, Alaska and northern Canada were connected, forming a dry and largely treeless landmass known as Beringia (Hopkins, 1959;Hopkins et al., 1982). The landscape was dominated by emblematic megafauna such as woolly mammoths and steppe bison, and in terms of biomass some authors have compared this period to the current African savannah (Zimov et al., 2012). Mammals had a major role in shaping vegetation community and structure by reducing vegetation density, enhancing nutrient turnover, dispersing seeds and reducing fire potential (Guthrie, 2001;Hester et al., 2006;Johnson, 2009). Reconstructing the species composition of this former plant community without a modern analogue, as well as the corresponding diets of the mammals that roamed it has been challenging.
According to Guthrie (1990) there were mainly open landscapes with highly productive graminoids and Artemisia sp. in a steppetundra biome that is often designated the 'mammoth steppe'. Recent studies have changed the view of the mammoth steppe vegetation into a more heterogeneous mosaic of different habitats. This mosaic consisted of areas rich in shrubs combined with permanent moist areas and productive grasslands (Chytrý et al., 2019;Lozhkin et al., 2019;Zazula et al., 2006). Willerslev et al. (2014) further showed that forbs (non-graminoid herbaceous vascular plants) were more abundant in the environment than previously thought, and featured in megafaunal diets to provide important proteins. Relatively little is known, however, about the specific plant species in megafaunal diets.
The shift in appreciation of the Beringian megafaunal habitats has been catalysed by a growing body of research that uses a multidisciplinary approach, combining pollen and plant macrofossils with DNA metabarcoding (Boast et al., 2018;Gravendeel et al., 2014;Haarsma et al., 2016;Hofreiter et al., 2000;Sønstebø et al., 2010;van Geel et al., 2008van Geel et al., , 2011avan Geel et al., , 2011bWillerslev et al., 2014). By improving taxonomic resolution and finding complementary taxa, DNA metabarcoding can help to resolve vegetation classifications where species resolution is required (e.g. steppe and tundra, partly defined on distinct species of grass; Swanson, 2006). Several studies on lake sediments have shown that instead of replacing traditional methods, DNA metabarcoding acts as a complementary proxy by revealing both additional taxa and providing increased taxonomic resolution (see e.g. Boessenkool et al., 2014;Parducci et al., 2019;Pedersen et al., 2013;Rawlence et al., 2014). While pollen grains mostly show a regional signal due to dominant wind-dispersed pollen (grasses and Artemisia sp.), DNA may represent a more local signal that is more similar to the spectrum of macrofossil taxa (Alsos et al., 2018;Boessenkool et al., 2014;Jorgensen et al., 2012).
While the studies cited above provide a good overview of the advantages and drawbacks of the different proxies used, all of these studies focussed on lake sediments. So far, there are few studies comparing these proxies in megafaunal faecal samples (e.g. Gravendeel et al., 2014;Hofreiter et al., 2003;van Geel et al., 2008). Strictly speaking, the faecal samples of extinct megafauna are not coprolites since they are not fossilized but perfectly preserved in permafrost. However, the plant macrofossils in these samples are drastically affected by masticatory and digestive processes, which may result in differential preservation of taxa and fragments becoming unidentifiable (van Geel et al., 2008). For pollen recovered from faeces an additional complicating factor is that the faecal samples are often dominated by wind-transported pollen or pollen deriving from ingestion of inflorescences from plants that were flowering at the time of consumption ( . The advantage of DNA as a proxy for dietary reconstruction is that it does not depend on flowering time or time of fruit setting, as vegetative plant remains are included in the DNA record (Willerslev et al., 2014). However, as in ancient sediments, not all taxa are recorded using DNA metabarcoding due to incomplete reference libraries, PCR bias, primer mismatches and DNA degradation (Jorgensen et al., 2012).
Most studies of ancient DNA from sediments have relied either on the P6 loop of the chloroplast trnL (UAA) intron or the rbcL gene, and both give good taxonomic resolution for some plant taxa but limited for others (Sønstebø et al., 2010;Taberlet et al., 2006). While in the animal kingdom the mitochondrial marker COI can be used as a universal barcode for identifying species (Hebert et al., 2003) no such universal barcode has been identified for plants. For this reason a combination of markers has been advised for plants, including both a nuclear marker and a plastid marker (CBOL Plant Working Group et al., 2011). Since permafrost acts as an excellent natural freezer, even long DNA fragments (up to 510 bp) have been recovered from sediments as old as 400 kyr (Lydolph et al., 2005;Willerslev et al., 2014). Yet in the study of ancient megafaunal faeces, the relatively long nuclear ribosomal ITS (nrITS) has rarely been used, and only to amplify relatively short amplicons (e.g. 240 bp in the Cape Blossom mammoth; van Geel et al., 2011b). Due to its length, nrITS has the advantage of being able to provide a higher taxonomic resolution, which in turn can give better insight into the palaeoenvironmental conditions represented by the taxa in a sample.
In this study, we aim to 1) investigate the potential of using the nrITS marker on megafaunal faeces, 2) compare the nrITS results to trnL, pollen and macrofossil records and 3) integrate results of all proxies to obtain a detailed reconstruction of ancient megafaunal diets and habitats. To this end, we applied DNA metabarcoding, pollen and macrofossil analysis on a variety of permafrost and icepreserved faecal samples from extinct and extant megafauna, specifically woolly mammoth, steppe bison, horse and caribou. In addition to the trnL P6 loop, we target the nrITS regions nrITS1 and nrITS2. The wide temporal range of the samples (28,000 to modern) further allows us to capture potential taphonomic effects on the recovery of the different marker regions and read counts, while inclusion of faecal samples from extant caribou with known diets and habitats enables validation of the diet and habitat reconstructions of the extinct megafauna.

Material
Eleven faecal samples from four mammal species were included (Table 1; for detailed information about location and dating see Table S1). Several of the samples we used here have been studied previously and DNA from the original material -which was stored at À80 C -was re-extracted and analysed here, except for the Oyogas Yar horse and Yakutian bison of which DNA extracts from previous studies were used (CTAB DNA extraction;Doyle and Doyle, 1987). All samples are derived from Russia, Canada and USA (Fig. 1) and are briefly discussed below.

Holocene and modern mountain caribou
Three northern mountain caribou (Rangifer tarandus caribou (Gmelin, 1788)) faecal samples were collected from cores in ice patch deposits in the Selwyn Mountains, Northwest Territories, Canada. Caribou visit these ice patches during the summer months to escape summer heat and insect harassment and their faeces are subsequently buried by snow creating stratigraphically discrete faecal bands that are very well preserved. The samples include faeces from modern caribou collected from the surface near the ice patch (Selwyn A), and two samples of late Holocene age collected from the ice core, Selwyn B and Selwyn C. From Selwyn A, DNA was retrieved by Galloway et al. (2012) confirming that caribou was M. Polling, A.T.M. ter Schure, B. van Geel et al. Quaternary Science Reviews 267 (2021) 107084 indeed the producer of the faeces. For the other samples, the faecal material was identified as being deposited by caribou based on the general shape, size and texture of the pellets, without additional DNA confirmation.

Holocene bison and horse
A colon sample of a horse (Oyogas Yar or Yukagir horse; Equus cf. lenensis Russanov, 1968) of middle Holocene age and a rumen sample of a Yakutian steppe bison (Bison priscus (Bojanus, 1825)) of early Holocene age were taken directly from permafrost preserved animals from the Sakha Republic, Russia (Boeskorov et al., 2014;Gravendeel et al., 2014; (Table 1). The Oyogas Yar horse was identified as being most closely related to the extinct Lena horse, Equus lenensis, based on body size measurements (Boeskorov et al., 2018).

Pleistocene mammoth and horse
Six Pleistocene faecal samples were analysed, including five woolly mammoths (Mammuthus primigenius (Blumenbach, 1799)) and one Yukon horse (Equus lambei (Hay, 1917)). Four specimens were obtained from the republic of Sakha (Yakutia), Russia, including the Maly Lyakhovsky, Abyland, Adycha and Yukagir mammoths. The Cape Blossom mammoth sample (or Alaskan Late Glacial mammoth) was obtained from Cape Blossom, Alaska, USA, and the Yukon horse was obtained from Dawson City, Yukon, Canada. Faecal samples were taken directly from, or in close vicinity Table 1 Overview of the samples used in this study including the existing and newly generated data, source of material and their age and collection locality. References from where the existing data was taken are [1] Galloway et al. (2012) [2] Boeskorov et al. (2014) [3] Gravendeel et al. (2014) [4]  [5] van Geel et al. (2011b) [6] van Geel et al. (2008) [7] Harington and Eggleston-Stott (1996). *D ¼ DNA, M ¼ plant macrofossils, P ¼ pollen. yDNA extract from previous study used.  van Geel et al. Quaternary Science Reviews 267 (2021) 107084 to the permafrost preserved animals, except for the Abyland, Adycha and Cape Blossom samples which were loose faeces. Validation of the faeces as being derived from woolly mammoth for the Yukagir, Maly Lyakhovsky, Cape Blossom and Yukon samples is based on previous studies (Grigoriev et al., 2017;Harington and Eggleston-Stott, 1996; van Geel et al., 2008van Geel et al., , 2011b. The identities of the Adycha and Abyland samples were confirmed using Sanger DNA analyses (Supplementary Text S2).

Radiocarbon dating
Radiocarbon dates of the caribou, horse, bison and Cape Blossom and Yukagir mammoth faeces were reported in previous publications (Boeskorov et al., 2014;Galloway et al., 2012;Gravendeel et al., 2014;Harington and Eggleston-Stott, 1996;van Geel et al., 2008van Geel et al., , 2011b. The faecal samples of the Adycha, Abyland and Maly Lyakhovsky mammoths were dated at the AMS facility of the Center for Isotope Research of the University of Groningen (The Netherlands). The 14 C ages are reported in BP, the conventional unit, and includes a correction for isotope fractionation and a defined half-life ( Van der Plicht and Hogg, 2006). The 14 C dates are calibrated into calendar ages using the presently recommended calibration curve IntCal20 (Reimer et al., 2020). The calibrated dates are reported in cal. BP, defined as calendar years relative to AD 1950 (Table S1).

Pollen and macrofossils
If available, pollen and macrofossil results were taken directly from published records (Table 1). Data was available for the Yukagir and Cape Blossom mammoths, the Yakutian bison, Oyogas Yar horse and two of the Selwyn caribou samples (Galloway et al., 2012;Gravendeel et al., 2014;van Geel et al., 2008van Geel et al., , 2011b. For Selwyn caribou A, only a pollen analysis was available (Galloway et al., 2012). If multiple counts were present from different subsamples, these were averaged to obtain one pollen count per sample. Macrofossil results for the Yukon horse were generated by Paleotec Services, Canada. This sample was previously studied for its plant DNA using trnL by Willerslev et al. (2014).
Pollen and spores (hereafter 'pollen') counts and macrofossil analysis were performed for the faeces of the Abyland, Adycha and Maly Lyakhovsky mammoths, Yukon horse (only pollen) and Selwyn caribou A (only macrofossil). The method for pollen preparation followed Faegri and Iversen (1989). Samples for pollen and macrofossil analyses were taken from the core of the faeces. Microscopic analysis of pollen was done at 400X and 1000X magnification. Pollen identifications were based on Moore et al. (1991) and Beug (2004) and a pollen reference collection. For the preparation of macrofossils, Mauquoy and Van Geel (2007) was followed. Bryophyte specimens were identified using Lawton (1971), Crum et al. (1981) and Vitt and Buck (1992).
In pollen analysis, the use of 'types' is common to denote a group of taxa that produce pollen that cannot be identified to a lower taxonomic level using microscopic analysis. Potentilla-type pollen for example includes pollen from species of the genera Potentilla, Comarum, Fragaria and Sibbaldia (Reitsma, 1966), which are all part of the subtribe Fragariinae of the Rosaceae family. All 'type' identifications were therefore converted to their corresponding maximum taxonomic level so as to better compare them to the DNA and macrofossil data. Similarly, the commonly used Asteraceae pollen subdivision Tubuliflorae and Liguliflorae were converted to Asteraceae subfamilies Asteroideae and Cichorioideae, respectively. All pre-PCR aDNA work (including subsampling) took place in the dedicated ancient DNA laboratory of Naturalis Biodiversity Center (Leiden, The Netherlands). We subsampled the faecal samples following recommendations of Cooper and Poinar (2000) and Wood and Wilmshurst (2016). Samples were UVC-irradiated for 5 min and the outer layer (±2 mm) removed with a clean scalpel. This process was repeated before taking three subsamples (±100 mg each) from the middle of the bisected samples.
The subsamples were ground in a Retsch CryoMill at À196 C, before DNA was extracted separately for each subsample following the silica-based extraction protocol of Rohland and Hofreiter (2007), adjusted to the smaller volume of material used as described in Stech et al. (2011). DNA extracts from the three subsamples were then pooled together. To control crosscontamination, DNA extractions were carried out in batches of two to three samples with one extraction blank (excluding faecal material) included in each batch (in total five extraction blanks).
A dual-indexing approach was applied using a set of unique primer-adapter combinations as described in Fadrosh et al. (2014). All DNA extracts were diluted 1:10, except for the Abyland and Cape Blossom mammoths, for which a 1:50 dilution was used. PCRs were carried out on a Bio-Rad C1000 Touch or Bio-Rad S1000 thermal cycler in 25 ml final volumes consisting of 15.4 ml nuclease-free ultrapure water, 1x Phire Green reaction buffer, 0.52 mM of each primer, 1.25 mM of dNTPs, 1 U Phire Hot Start II DNA Polymerase and 1 ml of the 1:10 or 1:50 diluted DNA sample template. Gradient PCR results were used to determine the optimum annealing temperature for each primer set. The following amplification protocol was used: a 30 s activation step at 98 C, 40 cycles including 5 s at 98 C, 5 s annealing at 55e60 C (depending on primers used; Table S3) and 15 s elongation at 72 C, plus a final extension step at 72 C for 5 min.
In order to mitigate stochasticity of DNA results, three PCR replicates were used for all samples using a unique tag combination for each replicate. Coelogyne fimbriata (Orchidaceae), native to tropical SE Asia, was used as a positive control for each primer set. The resulting PCR products were pooled into two pools based on amplicon length: a pool containing the shorter trnL fragments and a pool containing the longer nrITS fragments. Equimolar pools were made after measuring DNA concentrations on a QIAXcel (Qiagen). The pools were purified using Agencourt AMPure XP beads (Beckman Coulter), with a 1:0.9 (nrITS) or 1:1 (trnL) ratio and quantified using an Agilent 2100 Bioanalyzer DNA High sensitivity chip. Illumina adapters were ligated onto the amplicons using TruSeq DNA Nano Library Preparation kit (Illumina, USA) and subsequently sequenced at the Norwegian Sequencing Center on an Illumina MiSeq v2 300 cycles (150 bp x 2) for the trnL fragments and an Illumina MiSeq v3 600 cycles (300 bp x 2) for the nrITS fragments. The mitochondrial Sanger sequencing reads obtained from the Abyland and Adycha faeces were aligned and trimmed using Bio-Edit version 7.2.5 (Supplementary Text S2; Hall, 1999). A Mega-BLAST search was performed to check the resulting consensus sequences against the NCBI nucleotide database, and only sequences resulting in percentage ID > 98% were kept.

NrITS sequences
The three pools of nrITS sequences (plant nrITS1 and nrITS2, fungal nrITS2) were analysed separately with a custom pipeline on the OpenStack environment of Naturalis Biodiversity Center through a Galaxy instance (Afgan et al., 2018). Paired-end reads were first merged with PEAR (Zhang et al., 2014) using the standard settings and discarding non-merged reads. Amplicons were subsequently demultiplexed using the linked adapters option in Cutadapt version 2.8 (Martin, 2011). Only sequences containing both unique sample tags and forward and reverse primers were kept. Primer sequences were subsequently removed from the sequences with Cutadapt, allowing a maximum error rate of 0.15 (i.e. 3 to 4 bases).
The sequences were quality filtered and trimmed using the PRINSEQ sequence filter/converter tool (Schmieder and Edwards, 2011), using a minimum mean quality score of 20 and removing any sequences shorter than 150 bp. Sequences were dereplicated and sorted by size in VSEARCH v2.14.2 (Rognes et al., 2016) and clustered into Operational Taxonomic Units (OTUs) using the unoise3 algorithm from USEARCH v11.0.667 (Edgar, 2016) with default settings, removing singletons and potential chimeras. OTUs were subsequently identified using a MegaBLAST search against the NCBI Genbank nucleotide database for plant nrITS1 and nrITS2 (Benson et al., 2012), and the UNITE fungal nucleotide database for fungal nrITS2 (Nilsson et al., 2019). OTUs that matched at least 80% in coverage as well as identity to NCBI Genbank were kept. For final taxon identifications, a minimum of 80% identity recognition for family, 90% identity for genus and 97% for the species level was used. Sequences were further filtered in R (version 3.5.2) (R Core Team, 2013) to remove sequences with a lower number of reads from any of the samples than in negative controls (either extraction or PCR). This resulted in removal of suspected food contaminants including Pisum sativum, Brassica rapa/napus for nrITS1 and Citrus sp., Cucumis sativus and Musa sp. for plant nrITS2. For plant nrITS1 and nrITS2, the positive control was successfully amplified and the presence of Coelogyne fimbriata reads in the non-control samples was used to determine an OTU filtering threshold to correct for potential leakage. For nrITS2, this resulted in reduction of each sequence read count per replicate with 0.3%, while this value was 0.35% for nrITS1 and fungal nrITS2 (see Table S5.1 for full steps and read counts). Remaining replicates were merged while averaging the read counts per OTU. Finally, OTUs at species or genus level with the same taxonomic assignment were aggregated.
A curated arctic and boreal vascular plant and bryophyte database exists for trnL (see below), but not yet for nrITS. The plant nrITS results have therefore been carefully checked for their presence in the geographical areas where the faeces were collected. To this end, the Panarctic Flora (Elven et al., 2011), database of vascular plants of Canada (VASCAN) (Brouillet et al., 2010), GBIF (www.gbif. org) and the Plants of the World Online (POWO, 2019) were used (Boufford et al., 2016;Brouillet et al., 2010;Cody, 2000). This resulted in some aberrant records, such as non-boreal/tropical plants (e.g. Celtis sp. and Pteroceltis sp.) as well as some likely food contaminants (e.g. Allium cepa, Lagenaria siceraria) and these were manually removed (Supplementary Information S4). When many blast hits from different species with an equal BIT-score were found, the top 20 blast hits were manually checked for likely boreal species. When several species met this criterion, the last common ancestor of these hits was chosen. Fungal OTUs were assigned to functional groups (guilds) using FUNGuild (Nguyen et al., 2016).

TrnL sequences
The trnL sequences were analysed with the OBITools package (Boyer et al., 2016). OBITools is commonly used in ancient plant DNA studies with trnL as it allows direct assignment of sequences to taxa. The forward and reverse reads were assembled using illuminapairedend (min quality score of 40) and subsequently assigned to the corresponding samples using ngsfilter (only keeping sequences with a 100% tag match and allowing for a maximum of three mismatches with the primers). Using obiuniq, strictly identical sequences were merged, after which obigrep was used to remove singletons, sequences with ambiguous nucleotides and sequences shorter than 10 bp. Following Bellemain et al. (2013), obiclean was used to identify sequencing and amplification errors with a threshold ratio of 5% for reclassification of sequences identified as 'internal' to their corresponding 'head' sequence. The resulting sequences were compared to two taxonomic databases using ecotag. The first priority was given to a local taxonomic reference library containing arctic and boreal vascular plant taxa and bryophytes (arctborbryo database; Soininen et al., 2015;Sønstebø et al., 2010;Willerslev et al., 2014). A second reference library based on the global EMBL database (release 137) was used for mitigation of missing taxonomic assignment due to species potentially lacking in the first database (see Table S5.2 for full steps and read counts). The computations were performed on resources provided by UNINETT Sigma2 -the National Infrastructure for High Performance Computing and Data Storage in Norway.
The resulting sequences were further filtered in R to remove sequences that had (a) < 100% identity match to the reference libraries, (b) < 10 reads per PCR repeat and (c) sequences with higher number of reads in negative controls compared to the samples. This process resulted in the removal of suspected contaminant sequences derived from modern food plants such as Solanum subgenus Lycopersicon and Oryza sp. as well as some potential true positives including the genera Solidago, Trifolium and Helictochloa.
No Coelogyne fimbriata reads were recorded in the positive control for trnL, despite the presence of C. fimbriata sequences in the NCBI Genbank database (e.g. MK356212.1). The presence of C. fimbriata reads in the non-control samples to determine the MOTU filtering threshold (as was used for nrITS filtering) could therefore not be used. Instead, the maximum number of reads from the most abundant OTU (Salix sp.) in control samples was used, and accordingly each sequence read count per replicate was reduced with 1.0%. Remaining replicates were merged while averaging the read counts per OTU. Finally, OTUs at species or genus level with the same taxonomic assignment were aggregated.
Although this filtering resulted in losing potential true positives, these were only present in a low number of reads (<0.1% of the total number of reads). Furthermore, this relatively rigorous filtering allowed for removal of nearly all suspected false positives in the samples, and this was given preference over retaining as many true positives as possible (cf. Alsos et al., 2018). Remaining identifications were manually checked for suspected contaminants or taxa that were known not to occur in the arctic and boreal region. This process resulted in the removal of a few remaining suspected contaminants ( Supplementary Information S4). This is a common problem in metabarcoding studies, and the taxa we identify are similar to those found in other studies (Chua et al.,

Diet analysis and habitat types
The DNA reads were converted to relative read abundances to facilitate comparison with macrofossil and pollen data. When referring to 'diet' in this study from now on, we refer to the composition of the last meal consumed by the animals studied here, as inferred through the multiproxy approach on the faecal samples. The taxon identifications were grouped into the major groups of graminoids (grasses, sedges, rushes), forbs, shrubs/deciduous trees, coniferous trees, mosses and lichens. Since pollen records are biased towards high pollen producers and show primarily a regional signal (Jorgensen et al., 2012), they cannot be used to reliably reconstruct the diet. The record of macrofossils is strongly influenced by the food choice of the animal during its last meal (Mol et al., 2006) and has been shown to largely overlap with aDNA results (Parducci et al., 2015). Therefore, to provide a visual representation of the last diets, the average values of the relative abundance of the macrofossil results and all available DNA results were taken.
Plant identifications from DNA, macrofossils and pollen that could be assigned to the species level were used to reconstruct the habitat types of the megafaunal last diets. Some genera that are typically found in specific habitats have also been included (e.g. Eriophorum, Juncus in wetlands and Puccinellia in saline meadows). Habitat types were identified using a combination of sources: efloras (Brach and Song, 2006), Kienast et al. (2005) Only the presence of taxa and not their abundance was used to reconstruct the habitats, since abundance of certain taxa is highly affected by the selective food choice of the animals and may not reflect the palaeovegetation (Ashastina et al., 2018). The taxa were divided into 13 habitat types, ranging from relatively dry (steppe) to very wet (wetland: marsh, bog, fen, swamp). The modern known habitat preferences for the plant species were used, and the resulting habitat types are compared to modern analogues. For the modern caribou (Selwyn caribou A), the habitat consists of boreal forest in low-elevation areas, found together with arcticalpine tundra at high altitudes (Galloway et al., 2012).

Mammal sample identity
Genetic analyses confirmed the identity of both the Abyland and Adycha samples as Mammuthus primigenius (woolly mammoth), with a 100% match in both cover and identity (Table S2). This was further supported by the shape and size of the faecal pellets.

Pollen
For seven mammals, the pollen records were taken from the published records while four were newly generated in this study (Tables S6.1 e S6.11). The Selwyn caribou samples studied by Galloway et al. (2012) showed a mixed pollen signal with trees (ranging from 25 to 30%, Picea sp., Pinus sp., Alnus sp. etc.) and forbs (34e40%, mostly Artemisia sp.) being the most abundant. Selwyn caribou A further showed 33% shrubs (Salix sp. and Betula sp.) which were missing in Selwyn B, and rare (6%) in Selwyn C. Low amounts (<10%) of undifferentiated Poaceae as well as insectdispersed pollen (e.g. Asteraceae, Ericaceae, Polemonium sp. and Rosaceae) were identified in all three caribou samples.
The Holocene Yakutian bison and Oyogas Yar horse had high amounts of undifferentiated Poaceae pollen (71% and 92%, respectively; Gravendeel et al., 2014; . Cyperaceae was the second most abundant pollen type (4%) in the horse and also accounted for 6% in the bison sample. The bison further had a relatively high amount (9%) of Apiaceae pollen. Other pollen in both samples was derived from various shrubs (Betula sp. and Salix sp.) and forbs (e.g. Asteraceae, Plantaginaceae, Rosaceae). Tree-derived pollen (Abies sp., Pinus sp. and Alnus sp.) was present in both samples and made up 3e4% of the total.
The previously studied Yukagir and Cape Blossom mammoths showed abundant wind-dispersed pollen types consisting of Poaceae (both~70%) and Artemisia sp. (16% and 7%, respectively; van Geel et al., 2008;van Geel et al., 2011b). The newly obtained pollen results from the three Pleistocene mammoths (Abyland, Adycha, Maly Lyakhovsky) as well as the Yukon horse were also dominated by Poaceae and Artemisia sp. (>85%). The only sample with a low Artemisia count (1%) was the Maly Lyakhovsky mammoth, which was for 97% dominated by Poaceae. Insectdispersed pollen types were rare to very rare in all Pleistocene samples and were derived from many different families, e.g. Apiaceae, Brassicaceae, Caryophyllaceae and Papaveraceae. The only sample with coniferous tree derived pollen was the Adycha mammoth with 1% Pinus sp. pollen.

Macrofossils
Macrofossil analyses were taken from published records for eight samples and newly generated for three mammoths (Maly Lyakhovsky, Abyland and Adycha) as well as for Selwyn caribou A (Table S6.1 -S6.11). The macrofossils of the three Selwyn caribou samples showed a mixture of shrubs (genera Betula and Salix), lichen and mosses as the most dominant taxa, with grasses and forbs (e.g. Asteraceae, Caryophyllaceae) making up the remainder (Galloway et al., 2012). Selwyn C showed 44% lichen fragments.
The Yakutian bison faecal sample was dominated by vegetative remains of Poaceae and Cyperaceae (50%), wetland forbs (e.g. Comarum palustre and Menyanthes trifoliata) as well as Salix sp. and minor moss fragments ( . The Oyogar Yar horse sample was dominated by unidentified Cyperaceae remains and minor remains of Poaceae and several moss fragments . The previously studied macrofossils of the Yukagir mammoth faecal sample showed abundant poaceous remains together with Salix sp. and Carex sp. (van Geel et al., 2008). The herbaceous component was made up of plant remains from varying families, e.g. Asteraceae, Brassicaceae, Caryophyllaceae, Papaveraceae. Remains from several mosses were also identified, including Drepanocladus aduncus, Bryum sp., Entodon concinnus. The Cape Blossom mammoth macrofossils consisted of over 90% Carex sp., followed by Poaceae and a herbaceous component consisting of e.g. Minuartia rubella, Potentilla sp. and Cerastium/Silene sp. (van Geel et al., 2011b). Graminoids dominated the newly obtained data of the three mammoths Abyland, Adycha and Maly Lyakhovsky. This included poaceous vegetative remains, in the case of Abyland combined with one Carex sp. fruit and for Maly Lyakhovsky with the remains of a variety of mosses (e.g. Campylium stellatum, Cinclidium stygium, Drepanocladus sp., Warnstorfia sarmentosa).

DNA
Illumina sequencing resulted in 20.4 M read pairs for trnL and 16.4 M read pairs for nrITS. After quality filtering and clustering, 11.7 M reads were retained for trnL, 2.1 M reads for plant nrITS1, 2.2 M reads for plant nrITS2 and 5.0 M reads for fungal nrITS2. TrnL and fungal nrITS2 was successfully amplified in all samples while plant nrITS1 and nrITS2 was obtained for all but the Yukon horse, Cape Blossom mammoth and Selwyn caribou C.
The plant specific primers for the nrITS marker effectively amplified plant taxa, where 63.4% (nrITS1) and 70.4% (nrITS2) of the total OTUs were assigned to green plants ( Figure S15). Of the total OTUs, 3.8% and 7.3% were assigned to fungi, respectively. The remainder of the OTUs comprised green algae (Chlorophyta) and made up 6.6% of the total OTUs for nrITS1 and 19.4% for nrITS2. Across all samples, trnL produced 167 green plant OTUs, while 73 and 71 green plant OTUs were identified using plant nrITS1 and nrITS2, respectively (Tables S7 -S12). Per sample, trnL showed the highest number of green plant OTUs with on average 35.2 (range 12e74), while nrITS1 recovered on average 10.8 green plant OTUs (0e28) and nrITS2 12.5 (0e40) ( Table S16). For the fungal nrITS2, 88.2% of the total OTUs were assigned to Fungi, 11.6% to Viridiplantae and 0.2% was unidentified, while showing on average 20.2 fungal OTUs per sample (range 7e38; Tables S16). Read or OTUs counts were not correlated to the age of the samples for any of the markers.

Comparison of pollen, macrofossils and DNA data
Across DNA, pollen and macrofossil datasets, 311 plant taxa including 146 species, 150 genera and 63 families were identified ( Fig. 2; see Table S6.1-S6.11 for full recovered plant taxa information across all samples). With pollen analysis, 65 plant taxa were identified, while 84 plant and 5 lichen macrofossil taxa were found. DNA analysis resulted in 146 (trnL), 73 (nrITS1) and 71 (nrITS2) plant taxa. At all taxonomic levels, DNA analysis recovered the most unique plant taxa, with 16 families, 77 genera and 123 species (Fig. 2). However, unique taxa were also identified using both macrofossil (four families, 18 genera and 15 species) and pollen analysis (six families, 11 genera and one species). No species were recorded across all three proxies, while six genera (Androsace, Artemisia, Betula, Papaver, Rumex and Salix) and 14 families were shared in the DNA, macrofossil and pollen data. The biggest overlap of proxies was found between DNA and macrofossil results at the genus level (29 genera), while there was little overlap between the pollen and macrofossil results (three genera and two families).
Pollen and macrofossils could be identified to species level in 3.1% and 24.7% of the recovered taxa, respectively. For the DNA markers, 44.8% of the OTUs were identified to species level for trnL, while this was 70.9% and 78.2% for nrITS1 and nrITS2, respectively (Table S7, S9, S11). To illustrate the differences in taxonomic resolution between the three proxies as well as between the DNA markers, results of three plant families (Poaceae, Asteraceae and Cyperaceae) that were common to abundant in all 11 faecal samples are shown in Table 2. Taxa from these three families were found using all three proxies. For plant families where pollen could only be identified to the family level, macrofossils could in several cases be identified to genera within those families, and in rare cases to species level (e.g., Carex nardina and Carex dioica in the Cyperaceae family). The nrITS marker could identify species for taxa where trnL results were only identifiable to genus or family level. An example of this is the identification of the species Arctagrostis latifolia (100% identity) and Calamagrostis stricta (99.7%; Poaceae) using nrITS1, while trnL identification was only possible to the subtribe level (Agrostidinae). Similarly, where trnL identified Asteraceae subfamily Anthemideae, the nrITS marker found the species Artemisia scoparia and A. norvegica (both 100% identity). Unique Poaceae species (Koeleria asiatica, Festuca kolymensis) and Asteraceae species and genera (Artemisia gmelinii, Arnica, Saussurea) were, however, also found using trnL and this pattern was found throughout the whole dataset (Table 2 and Table S7).

Diet analysis
High congruence between the quantitative results of the different DNA markers was found for the Selwyn A and B caribou samples, with a dominance of shrubs (87e98%; Salix, Betula and various ericaceous taxa) and low abundance of forbs, graminoids and mosses (Fig. 3a). In contrast, the macrofossil results indicated high abundance of mosses, graminoids and lichen with only low amounts of shrubs. The combined diet reconstruction -based on DNA and macrofossils only -showed~75% shrubs with 10e15% mosses (Fig. 3b). Fungal nrITS2 results further identified low amounts of lichen, including Cladonia spp., Bryocaulon divergens and Stereocaulon saxatile (Table S13 e S14) that may have formed part of the caribou diet (0.3% of total fungal reads for Selwyn B and 0.1% for Selwyn A). For Selwyn caribou C, trnL showed a much higher amount of forbs (72%; mainly Asteraceae tribe Anthemideae and Sibbaldia procumbens) than the macrofossils (8%) or pollen (34%). The reconstructed diet differed from the other two caribou samples, consisting of 40% forbs and equal parts (15e20%) of shrubs (Salix), lichen and mosses.
Macrofossils of the Oyogas Yar horse were for >95% dominated by graminoids and this was also reflected in the trnL (85%) and nrITS1 (69%) data (mainly Eriophorum sp. and Dupontia fisheri respectively). The plant nrITS2 results, however, were dominated by mosses (73%). The diet reconstruction showed a dominance of graminoids (65%) with 20% mosses and equal amounts of shrubs  Table 2 All taxa recorded of three plant families (Poaceae, Asteraceae and Cyperaceae) that were common to abundant in all 11 faecal samples in DNA (trnL, nrITS1 and nrITS2), macrofossils and pollen analyses. The numbers represent the number of samples in which that specific taxon was found. and forbs (8%). The diet of the other, much older, Yukon horse contained a lower fraction of graminoids (28%) and, instead, was dominated by forbs (on average 60%; consisting of Braya rosea and Asteraceae tribe Anthemideae). Tree and shrub taxa were only identified in the macrofossil results for this sample. The Yakutian bison sample consisted on average of 48% forbs (mainly Cicuta virosa) and 25% each of graminoids (Eriophorum, Carex) and shrubs (Salix). The Adycha and Maly Lyakhovsky mammoth samples showed highly similar results from both proxies and the reconstructed diets consisted almost exclusively of graminoids (Fig. 3b). Graminoids in the Adycha sample consisted for >75% of Puccinellia sp. based on DNA analysis, while many species of Poaceae (including abundant Deschampsia cespitosa and Alopecurus magellanicus), as well as Carex sp. and Eriophorum sp. were found in the Maly Lyakhovsky sample. Mosses were found to be relatively abundant in this sample according to nrITS2 results (33%; mainly Polytrichastrum alpinum), while much lower percentages of mosses were found in nrITS1, trnL or macrofossil results. The three other mammoth samples showed a higher contribution of forbs to their diet, often with the DNA results of the different markers showing one species dominating the assemblage. For the Abyland mammoth this dominant species was Anemone patens, while in the Yukagir mammoth sample Myosotis alpestris was abundant. The Yukagir mammoth was the only one of the mammoth samples showing relatively abundant (on average 34%) shrubs (Salix) in its diet. In the Cape Blossom mammoth, graminoids made up >75% of macrofossils, while the trnL results showed 28% graminoids, consisting mainly of Carex. In the trnL results forbs were abundant (71%) and consisted for the largest part of Chamaenerion angustifolium and Asteraceae tribe Anthemideae.

Habitat types
We combined species and genus-level plant identifications from all proxy results to reconstruct the habitats in which the last meals of the studied megafauna were consumed ( Fig. 4; Table S17 for all plant species information).
Identified plant species in the Selwyn caribou A and B samples provided a range of habitats including wetland, woods and a large component of arctic-alpine tundra (e.g. Arctous alpina, Anemone richardsonii, Carex podocarpa and Pyrola grandiflora) along with taxa typical for mountainous/rocky habitats (e.g. Rhodiola integrifolia). The Selwyn caribou C sample similarly contained many species typical for arctic-alpine tundra but also included a large component of species typical for snow patches (e.g. Ranunculus nivalis, Ranunculus pygmaeus, Oxyria digyna).
The reconstructed habitats of the Holocene Oyogas Yar horse and Yakutian bison consisted mainly of wetlands, including marshes and river/lake sides. For the Oyogas Yar horse this included Eriophorum sp., Caltha palustris and Comarum palustre typical for marshes and e.g. Arctagrostis latifolia and Arctophila fulva/Dupontia fisheri from water sides. The Yakutian bison showed numerous Carex species, Menyanthes trifoliata, Epilobium palustre and Hippuris sp., all indicative of marshy wetland conditions as well as e.g. Endocellion sibiricum and Epilobium palustre typically found along rivers or ponds.
The Cape Blossom and Maly Lyakhovsky mammoth samples also included wetland components, with in the case of Cape Blossom e.g. Caltha palustris and species of Carex and for Maly Lyakhovsky Eriophorum sp., Caltha palustris as well as several grass species (Pleuropogon sabinei, Arctophila fulva). Moss species in the Maly Lyakhovsky mammoth further provided evidence of a wet, marshy environment (e.g. Drepanocladus sordidus, Cratoneuron filicinum, Warnstorfia sarmentosa and Dicranum bonjeanii). However, in contrast to the Holocene horse and bison, both these mammoth samples also included species indicative for dry meadows and, in the case of Cape Blossom, steppe (Festuca kolymensis and Artemisia gmelinii). Several true steppe species were also found in the Abyland mammoth (Silene samojedorum, Carex duriuscula, Artemisia scoparia) and Yukagir mammoth samples (e.g. Eritrichium sericeum, Festuca kolymensis, Phlox hoodii). Other taxa in both samples were indicative for dry meadows (e.g. Anemone patens and Cerastium maximum for Abyland and Myosotis alpestris and Eremogone capillaris for Yukagir). Furthermore for the Abyland mammoth, several species typical for wet meadows were identified (e.g. Sanguisorba officinalis, Stellaria borealis), while for the Yukagir mammoth a component of gravelly slopes and mountainous/rocky habitat was found (e.g. Smelowskia alba, Oxytropis deflexa, Rhodiola rosea). The Pleistocene Yukon horse also showed a last meal consisting of a mix of taxa from different habitats with species typically found in wet meadows and wetlands (Alnus incana, Juncus alpinoarticulatus) as well as dry meadow and steppe (Bromus pumpellianus, Artemisia gmelinii). The habitat for the Adycha mammoth consisted of meadows (e.g. Deschampsia cespitosa, Bromus pumpellianus) as well as a large component of saline meadow (Puccinellia sp.).

Comparison of proxies
Out of the three proxies used in the present study (DNA, pollen and macrofossils), DNA recovered the highest number of unique taxa at all taxonomic levels (Fig. 2). This is likely caused by the large amount of vegetative remains in the faecal samples that could not be identified beyond the family or genus level using macrofossil or pollen analysis. DNA analysis does not depend on the season when plants carry seed, fruit or pollen and allows identification of many taxa to the species level irrespective of their developmental stage. We also used primers for multiple marker regions (trnL, nrITS1, nrITS2), each identifying unique taxa and increasing overall taxonomic resolution (Tables S7 -S12).
In comparison to pollen from sediments, pollen spectra from our faecal samples were not very diverse (Jorgensen et al., 2012;Parducci et al., 2015;Pedersen et al., 2013). This could be because lake sediments accumulate pollen over a much larger spatial and temporal scale than faeces do. We took all the samples for our analyses from the middle of the faeces and thus caught only a snapshot of airborne pollen (i.e., sticking on ingested vegetation), mixed with pollen coming from ingestion of inflorescences. The taxonomic overlap between pollen and DNA, as well as between pollen and macrofossils was surprisingly low, and we instead found the highest overlap between DNA and macrofossil results. This is likely because both of these proxies are providing a local signal (showing the food choice of the animal) while the pollen analysis is influenced by accidental intake of pollen sticking to ingested vegetation as well as pollen from species producing high amounts of pollen (e.g. Jorgensen et al., 2012).

Metabarcoding detection gap
We use the term 'metabarcoding detection gap' here for taxa that were not retrieved in the DNA results (trnL or nrITS) but were present in the macrofossil and/or pollen records. In total, the metabarcoding detection gap consists of 12 families, 32 genera and 16 species (Fig. 2). Many of these taxa are very rare in the pollen or macrofossil counts, with most of them found in only one sample and in low abundance. For pollen this includes single identified spores and pollen of Botrychium sp. and Populus sp. in the Selwyn caribou samples, and Epipactis sp., Persicaria sp. and Thalictrum sp. in the mammoth samples. For such rare pollen grains it seems likely they were only present as pollen while being (very) rare in the M. Polling, A.T.M. ter Schure, B. van Geel et al. Quaternary Science Reviews 267 (2021) 107084 consumed vegetation. A lysis step with mechanical bead beating is necessary to break the exine of pollen grains and release the inner DNA (Polling, 2021). Since these steps have not been used here, this could explain the absence of these taxa from the DNA results. On top of this, pollen contains very little DNA that is hard to amplify even if present in high numbers (Parducci et al., 2005). Similar to proxy comparison studies on lake sediments (e.g. Parducci et al., 2019), we find that DNA from pollen contributes very little to the total DNA signal in faeces.
There are also taxa that were found as pollen with high relative abundance, while being very rare or absent in the other proxies. This includes, for example, pollen of the family Pinaceae which account for up to 30% in the caribou samples. Pinaceae pollen is often overrepresented in pollen records from the (sub)Arctic because they are high pollen producers and their pollen is spread over large distances (Aario, 1940). The genus Artemisia reached up to 40% in some pollen records (Selwyn caribou B; Table S6.2), yet it is very rare in both DNA and macrofossil results. Unfortunately, using trnL, the genus Artemisia cannot be distinguished from other genera from the subfamily Anthemideae (Anthemis, Achillea, Chrysanthemum, Tanacetum etc.). This subfamily was relatively abundant in Selwyn caribou C, Cape Blossom mammoth and the Yukon horse, and it cannot be resolved whether these reads actually belong to Artemisia. Rare fragments of Artemisia in the macrofossil records were only recorded in the Yukon horse and Selwyn caribou C samples. Part of this discrepancy can be explained by differential preservation, since macrofossils of Artemisia such as seeds or fruits (achenes) deteriorate rapidly and are therefore rarely recovered (Anderson and Van Devender, 1991;Birks, 2007). Other studies on DNA metabarcoding of Pleistocene megafaunal faeces also found high amounts of Artemisia pollen but very low abundance with DNA or macrofossils from the same samples (e.g. Kolyma rhinoceros and Finish Creek mammoth; Willerslev et al., 2014). For caribou, where in all three samples Pinaceae and Artemisia pollen is common to abundant, it is furthermore known that they do not actively select Artemisia and avoid Pinaceae (Denryter et al., 2017;Jung et al., 2015). These records are therefore interpreted as the results of accidental uptake of pollen sticking to selected plant taxa.
In the macrofossil data, we detected many taxa that were represented by one seed or plant fragment (e.g. Antennaria sp., Draba sp., Sagina sp., Hedysarum sp., Lysimachia sp.) and many of these are part of the metabarcoding detection gap. Furthermore, fragments of various mosses were exclusively found as macrofossils (e.g. Calliergon sp., Plagiomnium sp., Rhizomnium sp., Thuidium sp. and the spikemoss Selaginella sp.). It should be noted that DNA reference libraries are still far from complete, and this may be especially true for Arctic Russian moss species. Therefore, some of the species found as macrofossils may not be recoverable using DNA at this moment. One such example is the moss Cinclidium stygium for which no nrITS sequence is currently available in the NCBI Genbank. Apart from this, the expected amplicon size for bryophytes using the plant-specific nrITS primers in our study is > 500 bp (Cheng et al., 2016), which may cause some species to be missed due to the 600 bp restriction using Illumina sequencing. Furthermore, even though we applied a multi-locus approach, DNA primer mismatch in both trnL and nrITS could have occurred. Many Selaginella species for example show 5 mismatches in their DNA barcodes with the trnL-h as well as the ITS4 reverse primers used in this study. Lastly, DNA of plant fragments may have been simply too degraded to be amplified by any of the DNA markers.

Morphology detection gap
A 'morphology detection gap' is designated here as all taxa that are missing in either the pollen or macrofossil record but were found in the DNA results. In total, the morphology detection gap for the studied faecal samples consists of 16 families, 77 genera and 123 species (Fig. 2). The biggest factor contributing to many of the taxa only found as DNA is the higher taxonomic resolution that is achieved using DNA (although it depends on the percentage of identity used whether taxa identified by DNA are assignable to either, e.g., genus or species level). There are, however, a number of other factors that may determine the taxa in the morphology detection gap. First, many taxa only found with DNA were very rare (<0.1% of the relative amount of reads) and only recorded in one sample. These taxa could have either been very minor diet items or taxa that were not targeted (i.e. accidental intake), which were present in such low quantities that they may have been missed with the macrofossil or pollen analyses. Accidental intake could also explain the presence of several species in the DNA results of the caribou samples of which the ingestion of high amounts may be toxic (e.g. Pedicularis capitata, Oxytropis deflexa; Denryter et al., 2017). Secondly, some plant taxa may be more affected by the digestive processes than other plant taxa, causing them to be unrecognizable as macrofossils while still being recoverable using DNA. Lastly, despite extensive reference collections for pollen and macrofossils, identification may still be somewhat subjective with regards to morphologically very similar taxa. This is less the case for DNA using reference libraries that allow more objective identifications.
Taken together, this explains the abundance of some taxa in DNA results even though they were missing in the other proxies. One example is the willowherb family Onagraceae for which Chamaenerion angustifolium and Epilobium palustre were found in DNA of seven of the samples studied here. Rare Onagraceae pollen were only found in the Cape Blossom mammoth (van Geel et al., 2011b). Although pollen from insect-pollinated plants are always underrepresented in faecal samples, we identified abundant Chamaenerion angustifolium in the DNA results of the Cape Blossom sample.
No macrofossil remains of Onagraceae were recorded in any of the samples, and this is likely because vegetative Onagraceae remains are very hard to recognize due to their ambiguous morphology (Anderson and Van Devender, 1991;Grímsson et al., 2012). Similarly, the forget-me-not family Boraginaceae is only recovered using DNA. It was especially abundant in the last meal of the Yukagir mammoth (Myosotis alpestris and Eritrichium sericeum). An additional species (Mertensia paniculata) was identified in the faecal samples of the caribou and the Cape Blossom mammoth, yet no remains of Boraginaceae were found in either pollen or macrofossil analyses of any sample. Pollen grains of members from this family are particularly small (5e7 mm) and could potentially be overlooked during analysis while vegetative macrofossil remains are hard to identify. Macrofossils of Boraginaceae and Onagraceae have not been recorded in any other mammoth faeces, even though they were recorded in high abundance in DNA data (e.g. Finish and Drevniy Creek mammoths as well as Yukagir bison; Willerslev et al., 2014). These examples show the added value of DNA analysis and indicate that vegetative plants of these families may likely have formed part of the diets of the studied megafauna.

Comparison of plant DNA markers
Our application of multiple DNA markers on megafaunal faecal samples reveals the added value of a multilocus approach. The three samples for which no plant nrITS results were obtained were of very different ages (±2.7, ±14.4 and ± 30.9 kyr BP), while older samples did produce plant nrITS amplicons (Abyland and Maly Lyakhovsky mammoths). While nrITS amplicons were found in all samples, for the three samples where no plant OTUs were found, these were all either derived from contamination, algae or fungi. Fragments of DNA up to 500 bp have been recovered from permafrost preserved sediments as old as 400 kyr (Lydolph et al.,Fig. 4. Habitat reconstruction of megafaunal species based on integrated (pollen, macrofossils, DNA) species and genus resolution data. The samples were sorted according to their age and the average calibrated age of each sample is indicated between brackets.
M. Polling, A.T.M. ter Schure, B. van Geel et al. Quaternary Science Reviews 267 (2021) 107084 2005). Therefore, it most likely depends on the conditions in which the specimens were preserved over time that determined whether or not these long fragments can be recovered. Some samples may have inadvertently been (partially) thawed at some stage, causing longer DNA fragments to be degraded, while the shorter and more stable trnL was not affected. Most unique taxon identifications of the nrITS marker come from increased taxonomic resolution of several families and genera that show relatively low taxonomic resolution in the other proxies. This includes, for example, the genus Carex for which six unique species were found and the family Poaceae for which 11 unique species were identified with nrITS (Table 2). Furthermore, nrITS identified a larger variety of mosses than trnL, which is likely the result of the very short sequence length of the bryophyte P6 loop (±22 bp) obtained using the trnL g and h primers. These primers were not designed for bryophytes, and the recovered length often prevents sufficient taxonomic detail (Epp et al., 2012;Soininen et al., 2015). Nevertheless, many unique plant species were found using trnL, which could be the result of the more complete reference libraries available for trnL compared to nrITS. Many nrITS reference sequences in the NCBI Genbank database do not represent the complete marker region (e.g. Pleuropogon sabinei and Ranunculus nivalis with partial nrITS2 sequences) or are simply missing altogether because no reference sequences have been deposited yet. This is, for example, seen for species in the genus Puccinellia where not all Russian endemics have been sequenced (missing e.g. Puccinellia manchuriensis, P. byrrangensis, P. jenisseiensis), and this might also explain why we find P. vahliana (nowadays a western Arctic species) in nrITS results. Apart from that, the shorter and more stable trnL P6 loop produced results for the samples that did not produce any results from nrITS, which further explains the number of unique trnL identifications.

Diet analysis
The diet analysis of Selwyn A and B showed that shrubs are highly dominant in the summer diets of caribou, which is in agreement with known diets of summer foraging caribou that consists of deciduous shrubs along with reindeer lichen and fungi (Bergerud, 1972;Boertje, 1984). Lichen were observed using macrofossil and fungal nrITS2 analysis, and were also indirectly detected with plant DNA by the presence of lichen phycobionts in the plant nrITS2 results (e.g. Asterochloris, Symbiochloris and Trebouxia spp.), only found in the Selwyn caribou samples (Table S18). Trebouxia is the most common phycobiont in extant lichen, while Asterochloris is mainly associated with lichen of the families Cladoniaceae and Sterocaulaceae (Pino-Bodas and Stenroos, 2020). Both families were also identified using fungal nrITS2 (Table S13 e S14), providing further support that the caribou ate lichen. The diet of modern caribou is well studied and for many Arctic plant species it is known whether they are either "selected", "neutral" or "avoided" based on observations of foraging caribou (Bergerud, 1972;Denryter et al., 2017). An average diet of modern caribou was found to consist of 78% selected, 15% neutral and 7% avoided species (Denryter et al., 2017). For Selwyn A and B, "selected" plant taxa made up >85% of relative abundance of all DNA markers, while "avoided" taxa made up <5% (Table S19). This is in contrast to macrofossil results that showed up to 21% avoided taxa, mainly from mosses. Selwyn caribou C showed a large component of diet items that were of unknown (43%) and neutral diet preference (44%), with only minor (11%) selected plant taxa (Table S19). This points to a somewhat atypical summer diet for this caribou when compared to modern caribou preferences and may suggest a different vegetation composition in its habitat.
The diets of nearest living relatives for Holocene bison and horse are well studied. While horses are typical grazers nowadays with diets consisting >75% of graminoids (Mendoza and Palmqvist, 2008), this has not always been the case. Several studies have shown that prehistoric horses had mixed grass-browse diets, especially in winter when grasses were harder to access (Kaczensky et al., 2017;MacFadden et al., 1999). The diet of the Holocene Oyogas Yar horse (Equus cf. lenensis) is typical for a grazer, with the main component being identified as graminoids. The Pleistocene Yukon horse (Equus lambei), however, consumed mostly forbs. The season of death could not be determined for the Oyogas Yar horse (although it could be spring to summer due to relatively high amount of Cyperaceae pollen), while for the Yukon horse it was determined as winter (Harington and Eggleston-Stott, 1996;Harington, 2002). This could explain why grasses made up only 28% of the total diet for the Yukon horse (Fig. 3b). It is likely that snow covered much of the grass cover, forcing the horse to focus on other available dietary items or that grasses were simply less abundant or of lower nutritional value (Savage and Heller, 1947). The now extinct steppe bison (Bison priscus) was closely related to modern bison (Bison bison (Linnaeus, 1758); Marsolier-Kergoat et al., 2015). While modern bison are often thought of as grazers feeding for the majority on graminoids, their summer diets are more variable, consisting on average of 44% grass, 38% forb, 16% shrubs and <2% sedge (Leonard et al., 2017). This is similar to the DNA results of the Yakutian bison studied here, where forbs and shrubs are important components. Pollen of undifferentiated Apiaceae (identified by nrITS as Cicuta virosa) were also relatively abundant in this sample (9%) indicating ingestion of inflorescences. This may indicate that the Yakutian bison had its last meal in summer and was a mixed feeder that did not rely solely on grasses. The 'warm season' (spring/summer) was also identified as the most likely season of death for the Yakutian bison by  and Boeskorov et al. (2016). The >52 kyr old bison (Bison sp.) studied by Willerslev et al. (2014), similarly showed a high abundance of forbs and shrubs (80%), although no season of death was identified for this sample. Lastly, the abundance of poisonous Cicuta virosa (water hemlock) in nrITS, and also recognized to lower taxonomic resolution in trnL, pollen and macrofossils, possibly indicates that the Yakutian bison died of hemlock poisoning (Jacobson, 1915).
The last meals of the Maly Lyakhovsky and Adycha mammoths consisted almost exclusively of graminoids. Some of these grasses can grow to considerable size (75e100 cm) and may have provided sufficient nutritional value for mammoths (e.g. Bromus pumpellianus, Deschampsia cespitosa, Dupontia fisheri). Furthermore, the genus Puccinellia which was identified as the main component in the Adycha mammoth last diet, includes several species that are commonly grown for hay making for cattle in modern day Yakutia, Russia (Gavril'eva, 2011). The other mammoths studied here had much lower relative amounts of graminoid DNA, or barely any in the case of the Yukagir mammoth. The last diet of the previously studied Mongochen mammoth as reconstructed using macrofossils consisted mainly of mosses, forbs and only minor grasses and shrubs while DNA results showed dominance of 98% graminoids (Kosintsev et al., 2012a;Willerslev et al., 2014). The authors suggested that the underrepresentation of graminoids in the mammoth faeces could be the result of the digestive processes breaking down the poaceous tissues, although this is not supported by our finding of graminoids being dominant in the other mammoth faeces. It does, however, hold for forbs which are underrepresented in macrofossil and pollen results as compared to our DNA data, which has also been found in previous studies (e.g., Kosintsev et al., 2012b;Willerslev et al., 2014). The last meals of the Abyland and Cape Blossom mammoths may not have consisted solely of graminoids as suggested by the macrofossil analysis, but M. Polling, A.T.M. ter Schure, B. van Geel et al. Quaternary Science Reviews 267 (2021) 107084 supplemented with Anemone patens (Abyland) and various other forbs, while shrubs and Chamaenerion angustifolium were consumed by the Cape Blossom mammoth. The abundance of Salix sp. and Boraginaceae (Yukagir) provides further evidence for the diversity in mammoth diets. Another potential explanation for the differing diets may be sought in the different seasons of death, which could be determined for three of the mammoth samples studied here. The season of death of Maly Lyakhovsky mammoth was determined as late summer to early autumn (Grigoriev et al., 2017), while for both Yukagir and Cape Blossom mammoths autumn to early spring was suggested (Mol et al., 2006;van Geel et al., 2011b). A recent study on molar enamel profiles found that mammoths may have had seasonally different diets, shifting between browse and grasses (Uno et al., 2020). Also in the previously published Mongochen mammoth that died mid-summer and for which DNA, pollen and macrofossil results were analysed, the last diet was interpreted to be dominated by graminoids (Kosintsev et al., 2012a;Willerslev et al., 2014). This limited amount of data suggests that warm season diets of mammoth may have been dominated by graminoids (Maly Lyakhovsky, Mongochen), while they relied on various other food sources in the cold season (Cape Blossom, Yukagir). However, more multiproxy data is needed to support this hypothesis.
In some of the faecal samples studied here, mosses were identified in abundance either in the macrofossils (Selwyn caribou A and B) or in DNA results (nrITS2; Oyogas Yar horse and Maly Lyakhovsky mammoth) while being nearly absent in the other proxies. The relative abundance of mosses in the macrofossils of the caribou faeces is probably the result of accidental ingestion when the caribou were foraging low on the ground for dwarf shrubs and lichens. The moss species that was abundantly found with nrITS2 in the Oyogas Yar and Maly Lyakhovsky sample was Polytrichastrum alpinum which was detected only as rare fragments in the macrofossil remains of these samples. Potentially the primers used to amplify the nrITS2 region caused preferential amplification of this type of moss. Although abundant moss fragments have been identified in macrofossils from several mammoths (Kosintsev et al., 2012a(Kosintsev et al., , 2012b, and are sometimes found in caribou faeces (Denryter et al., 2017), they are unlikely to have formed a major part of the diet for any of the extinct and extant mammals studied here because of their low nutritional value.

Habitat types
The reconstructed habitat for Selwyn caribou A and B corresponds well with the known current habitat of these animals in the Selwyn Mountains in Northwest Territories, Canada. The habitat for these two samples consists of elements from both downslope boreal forest and its wetlands, along with upslope alpine tundra. It is important to note that the two most dominant diet items as identified by DNA (Salix and Betula), are not included in the habitat analysis because neither of them could be identified beyond the genus level. Species from these genera have varying habitat preferences and therefore the genus level identifications did not provide enough information to infer the habitat, the only exception being rare Salix alaxensis in Selwyn Caribou B which typically grows in forested habitat along streams and lakes (Boufford et al., 2016). The only Betula species found in the Selwyn Mountains are B. glandulosa (dwarf birch, shrub) and B. papyrifera (canoe birch, tree), with the dwarf birch being far more common (Galloway et al., 2012). The habitat reconstructed for Selwyn caribou C may indicate that the faeces in this sample was deposited by caribou that consumed a meal nearer to the ice patch.
When many megafauna species disappeared at the end of the Pleistocene, the Holocene vegetation shifted significantly to become a more waterlogged environment with mossy and shrubdominated tundra and deciduous forests (Edwards et al., 2005;Guthrie, 2001). The habitats reconstructed for the Holocene horse and bison reflect this mesic environment. Previous studies on these samples, however, indicated dry steppe-like conditions based on pollen and macrofossils due to the abundance of Poaceae remains (Boeskorov et al., 2016;Gravendeel et al., 2014;. However, here we find that the species composition of Poaceae for both samples included Dupontia fisheri, Arctophila fulva and Arctagrostis latifolia, all species typical for wetland habitats. Similar to the results for the Holocene Yakutian bison, modern bison (Bison bison) are known to prefer sedge marshes over other habitat types (Belanger et al., 2020 and references therein). Our results show that both horse and bison are not strictly graminoid grazers, but utilize wetlands in their habitat as well. This is also confirmed by the habitat reconstructed for the Pleistocene Yukon horse studied here, that showed a mixed environment of wetland and dryer meadows and steppe. Furthermore, a recent study on dental micro-and mesowear of horse and steppe bison also found that both were likely mixed feeders, instead of obligate grazers (Kelly et al., 2021).
Mixed environments were also identified for the mammoth samples, although with varying degrees of wetland components. The oldest mammoth studied here, Maly Lyakhovsky, showed many species typical for a marshy environment. This is in contrast to the Abyland mammoth that was collected from the same geographic area (North Sakha republic, Russia) and of similar age, that showed a much larger steppe and dry meadow habitat. This relatively large steppe component was also found for the Yukagir mammoth, although for this mammoth it was mixed with many plants typically found on gravelly slopes and mountainous areas. This may indicate that mammoths may have been versatile in their diets, adapting to the various habitats that were available. This is further supported by the habitat reconstructed for the Adycha mammoth, which shows that saline meadows were present and utilized by mammoths as well. For the Cape Blossom mammoth, no nrITS results were obtained which hampers the habitat reconstruction. However, with the other proxies a habitat similar to the Maly Lyakhovsky mammoth was reconstructed, with marshy wetland and surrounding wet meadows, intermixed with steppe and dry meadow. The variety of diets obtained from different habitats supports the idea that the 'mammoth steppe' was a mosaic of habitats instead of a homogeneous vegetation type (e.g. Zazula et al., 2007). Furthermore, the specific plant species mixture identified for these mammoths is not found in any modern habitat type, pointing to non-analogue plant communities (Williams and Jackson, 2007). Our results also indicate that mammoths were not exclusively grazers, but rather opportunistic mixed-feeders.

Conclusions
We integrated multilocus plant DNA, macrofossil and pollen analysis to obtain detailed reconstructions of megafaunal diets and habitats. We found most plant species in faecal samples uniquely using DNA, some of which abundantly so. This could be because of the large number of vegetative plant remains in the faeces which have become unidentifiable for macrofossil analysis due to masticatory and digestive processes. Unique plant taxa were, however, also found using both macrofossil and pollen analysis. We further show that relatively long nrITS fragments can be amplified from faecal samples as old as 28,610 14 C BP and that these help to increase species resolution for many plant families (e.g. Asteraceae, Cyperaceae and Poaceae) as well as mosses that could not be retrieved using trnL.
We could accurately reconstruct the known diet and habitat of M. Polling, A.T.M. ter Schure, B. van Geel et al. Quaternary Science Reviews 267 (2021) 107084 modern and late Holocene caribou (i.e. abundant shrubs from an arctic alpine tundra) and extended this approach to Holocene and Pleistocene megafauna including horse, steppe bison and woolly mammoth. These reconstructions showed that the Holocene steppe bison and horse were not strict grazers but rather mixed feeders that were foraging in a marshy wetland environment. This result is in sharp contrast with previous reconstructions that suggested dry steppe-like conditions for these samples. We further find that the five Pleistocene mammoths studied here had very different last meals obtained from a variety of habitats including wetland, wet meadow, gravelly slopes, saline meadow and steppe. This confirms the presence of a mosaic of habitats in the Pleistocene landscape often referred to as the 'mammoth steppe' that mammoths could fully exploit due to a high flexibility in their diet choice.

Data availability
All raw read data are available at the European Nucleotide Archive (ENA) with the study accession number PRJEB44352 (sample metadata, including sample names and primer-adapter sequences, is available in Table S20). Processed read data is available in the supporting information (Tables S7 -S14).

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.