Deeply divergent archaic mitochondrial genome provides lower time boundary for African gene flow into Neanderthals

Ancient DNA is revealing new insights into the genetic relationship between Pleistocene hominins and modern humans. Nuclear DNA indicated Neanderthals as a sister group of Denisovans after diverging from modern humans. However, the closer affinity of the Neanderthal mitochondrial DNA (mtDNA) to modern humans than Denisovans has recently been suggested as the result of gene flow from an African source into Neanderthals before 100,000 years ago. Here we report the complete mtDNA of an archaic femur from the Hohlenstein–Stadel (HST) cave in southwestern Germany. HST carries the deepest divergent mtDNA lineage that splits from other Neanderthals ∼270,000 years ago, providing a lower boundary for the time of the putative mtDNA introgression event. We demonstrate that a complete Neanderthal mtDNA replacement is feasible over this time interval even with minimal hominin introgression. The highly divergent HST branch is indicative of greater mtDNA diversity during the Middle Pleistocene than in later periods.

Supplementary Figure 2. Collagen δ 13 C range of steppe herbivores (in blue from the top right: horse, bison, red deer and reindeer) compared to the δ 13 C values of two red deer (in green on the left) from the same stratigraphic unit of HST suggesting for the latter a light forested environment (Supplementary Table 1). Error bars represent the standard deviation of the average δ 13 C ‰ values.    Figure 9. Probability that all Neanderthal mitochondrial lineages originated from an admixture event with a BMH population, as a function of the admixture rate, for a piece-wise constant Neanderthal effective population size history that approximates the history estimated from the Bayesian skyline method.
Supplementary Figure 10. Maximum parsimony tree built in MEGA6 with the HVRI regions of: HST, 20 Neanderthals, three Denisovans and rCRS. The tree was tested with 1000 bootstrap iterations and complete deletion. The accession numbers of the HVRI published sequences are reported in graphs next to individual name. 3668" and the find-number was written directly onto the bone after the excavation. Because of its archaic morphology the excavators suggested already in their first publications that the finding was a skeletal element of a Neanderthal individual 6-8 . It was not until several decades later that the specimen was published and morphologically described in detail 9 . The absence of the bone's epiphyses, the presence of gnawing marks of large carnivores on both ends of the long bone 10 , as well as the absence of other remains from archaic humans in the cave, suggest that the femur was brought inside Hohlenstein-Stadel by carnivores or modified by carnivores after the initial deposition of the skeleton. The femur was discovered at the entrance of the cave in a horizon that was correlated with an inner layer of the cave, known as the "Black Mousterian", as described by the archaeologists during excavations in the 1930s. Inside the cave the "Black Mousterian" was found at the lowermost of the stratigraphic sequence associated with Middle Paleolithic artifacts. During recent excavations on the forecourt of the cave, a displaced layer of black sediment was discovered (layer BG) 11 and radiocarbon dated to >50,000 14 C years BP (ETH-38795) 12 . This displaced horizon from the forecourt can be correlated with the layer of the "Black Mousterian" from inside the cave as well as with the horizon where the HST femur was found. In fact, layer BG was discovered only a few meters away from the spot where the hominin remain was unearthed and showed clear indication of periglacial movements of sediments from inside the cave to the forecourt. Is suggested that the black layer containing the HST femur was also not in situ but displaced from the deeper chamber of the cave. A substantial amount of microfauna was discovered in the forecourt BG horizon, suggesting moderate climatic conditions that were also confirmed by the presence of a limited number of arctic faunal species 11 . The composition of small mammals is unknown in southwestern Germany during the Marine Isotope Stage 3 (MIS 3), but it is also unlikely to have originated during the Eemian interglacial period (MIS 5e). Therefore it is more plausible that this layer formed during one of the moderate interstadial periods at the beginning of the last glaciation (MIS 5c or 5a) 9,13 .

Supplementary Note 2: Isotopic results
Collagen extraction was performed at Tübingen University and followed the method from Longin 14 and described in Bocherens et al 15 . Isotopic measurements of δ 15 N and δ 13 C were done using an elemental analyzer NC2500 connected to a Thermo Quest Delta + XL isotopic ratio mass spectrometer. The isotopic ratios are expressed using the "δ" (delta) value as follows: δ 13 C = [( 13 C/ 12 C) sample /( 13 C/ 12 C) reference -1] ×1000‰, δ 15 N = [( 15 N/ 14 N) sample /( 15 N/ 14 N) reference -1] × 1000‰. The standard for δ 13 C is the internationally defined marine carbonate V-PDB. For δ 15 N the atmospheric nitrogen (AIR) is used. Analytical error based on laboratory standards is ±0.1‰ for δ 13 C ! values and ±0.2‰ for δ 15 N. The chemical preservation of collagen is expressed through the atomic ratio of C coll :N coll , whose acceptable range of variation is 2.9-3.6 16 , while the nitrogen content (N coll ) should be above 5% 17 . The attempt to directly radiocarbon date HST femur resulted in an age of 34,130-34,880 years cal BP 570 ± 190 14 C years). This date is inconsistent with the estimated end of the Mousterian around 40 ka 18 and with the femur's stratigraphic position (see Supplementary Note 1). Stable isotopic composition in the collagen of HST femur (HST 17) differed from that of late Neanderthals 1 (Supplementary Fig. S1 and Table S1). In fact, the lower δ 13 C and δ 15 N values in the hominin specimen correspond to a different ecology from the one of late Neanderthals in western-central Europe. We can exclude contamination as the responsible factor for the obtained stable isotope values. While few percent of modern carbon could explain the measured radiocarbon date, this would not affect notably the isotopic composition. Furthermore, we analyzed two faunal remains (HST 21 and HST 26) from the same stratigraphic unit where the HST femur was discovered. Collagen from both specimens was extracted and radiocarbon dated to >49,000 14 C years (HST 26) and 46,975 ± 1000 14 C years (HST 21) (Table S1). In this time range, both dates can be considered beyond the limit of radiochronometric dating, where a minimal proportion of contamination with modern carbon could result in a wrongly finite date. Both specimens were initially identified as red deer on a morphological basis and ZooMS analyses 19 performed on HST 26 collagen confirmed the species assignment. Moreover, collagen of both cervid remains provided δ 13 C isotopic values distinctively lower than individuals of the same species grazing in open steppic habitats 20 (Supplementary Table S1, Supplementary Fig. S2). Isotopic evidence thus suggests that both the hominin and the two deer from Hohlenstein-Stadel Black Mousterian lived in a more forested rather than a steppic environment typical for the Late Neanderthals in northwestern Europe.

Supplementary Note 3: Reference mapping bias
In order to evaluate the impact in the consensus reconstruction of the four different mapping references, we used the software MUSCLE 21 to first generate a multiple genome alignment of the four consensus with 17 Neanderthal mtDNAs 22-27 and rCRS 28 as out-group. We excluded the individual Goyet Q57-1 from Rougier et al. 27 in the phylogeny and further analyses because around 2% of mtDNA positions were unassigned. We then built a phylogenetic tree with the maximum parsimony method (SPR algorithm) in MEGA6 29 . A total of 16,255 positions were considered in the phylogeny with complete deletion and 1000 replicates as bootstrap support ( Supplementary Fig. S4A). Looking at the tree topology, all four consensus variants are placed on a basal mtDNA Neanderthal lineage but with different branch lengths. In particular, when the two modern human mtDNA sequences (rCRS and RSRS) are used as mapping references, the resulting consensus sequences show shorter phylogenetic branches. We interpret this phenomenon as a reference bias that facilitates contaminant reads to map better than the endogenous ones, producing consensus sequences closer to the modern human references. Instead, when using Neanderthal references (Feldhofer 1 and RNRS) that are phylogenetically more similar to the endogenous mtDNA we observe a greater amount of derived positions. After visual inspection (Methods section) we could confirm that all derived positions observed in the RNRS consensus are indeed of endogenous origin and thus this sequence was used for phylogenetic and mtDNA diversity analyses. We further observed that 18 of the 19 inconsistent positions across the four consensus sequences are placed in the D-loop (rCRS pos. 16023-577) where the most polymorphic regions of the mtDNA are located (HVRI and HVRII). We then removed the D-loop from the alignment and constructed an additional maximum parsimony tree with the same parameters described above (15,345 positions) ( Supplementary Fig. S4B). As expected, the reference bias highlighted previously was overcome. Thus we used the most conservative coding region consensus to perform BEAST and additional phylogenetic analyses.

Supplementary Note 4: Mutation rate mtDNA
The enlarged dataset of 18 complete Neanderthal mtDNAs provide us with the opportunity to create for the first time to our knowledge a skyline plot for Neanderthal population assuming panmixia. We used only the mtDNA coding region and not the whole molecule for BEAST analyses because the vast majority of unassigned nucleotides in the published Neanderthal mtDNAs are located in the D-loop. This region is the most variable of the mtDNA with the highest mutation rate. Having the D-loop poorly covered but using a mutation rate for the whole molecule would effectively accelerate the mutation rate of the coding region. Therefore as molecular clock we set a fixed rate of 1.57*10 -8 μ / site / year 30 calculated for the coding region of modern humans with ancient mtDNAs as calibration points. Additionally we used the eight radiocarbon Neanderthal dates (Supplementary Table S6) as time anchors on the Neanderthal branch. When we tried to estimate the mutation rate independently with these dates as tip calibrations, BEAST runs did not converge or provided a nonrealistic rate of one order of magnitude lower than observed for modern humans. Two possible reasons could explain these patterns. First, late Neanderthals ages are at the limit for the radiocarbon dating method, therefore minimal contaminations with modern collagen could result in considerably younger ages, as reported for El Sidron Neanderthal specimens 31 . Second, the eight radiocarbon dated Neanderthals have an average age in a range of only 3,000 years (mean values from ~41 to ~44 ka) (Supplementary Table S6). Therefore there might be not enough temporal depth to calibrate the molecular clock in a phylogenetic tree with divergence times in the order of several hundred thousands of years. For BEAST analyses we thus assumed that the mtDNA mutation rate of Neanderthals would be similar to the one of modern humans.

Supplementary Note 5: Likelihood of basal modern human-Neanderthal mtDNA admixture rate
We are interested in obtaining the probability that all Neanderthal mtDNAs are transferred from a branch that splits off basal from the modern human (BMH) lineage, assuming an instantaneous admixture event at some point in the past, before modern human populations started diverging into present-day groups. This is effectively a likelihood of our data given the rate of this admixture event. Let g(n,j,t) be the probability of there being j lineages at time t in the past, given that there were n lineages at time 0 in the present, measuring time in coalescent units 32 : Also, let b(j,n,p) be the binomial probability of j successes out of n trials, when p is the probability of success: At the time of admixture, we can treat each lineage transferred (backwards in the past) from BMH to the Neanderthal population as a success with probability equal to the admixture rate, assuming the admixture event was unidirectional. Then, for a given admixture rate r and a number y of Neanderthal mitochondrial lineages sampled in the present, the probability that all Neanderthal mitochondrial lineages originally came from the BMH population via this admixture event is: where the sum is over the possible number of Neanderthal lineages that may have existed at time t in the past.
The time t is measured in coalescent units. Assuming constant population size, this is equal to the time τ in generations scaled by twice the effective population size (N e ): For our analysis, we assume 29 years per generation, and set y = 18, as this is the number of Neanderthal mitochondrial genomes sampled so far. We begin by assuming the admixture event happened 300,000 years more anciently than the time of sampling. Assuming the sampling occurred 40,000 years ago (age of the latest Neanderthal mtDNAs), the event is therefore set to be more anciently than the deepest modern human population splits but more recently than the Neanderthal-Denisovan population split 25 . Using Mathematica 33 , we can plot the probability P[y,r,t] as a function of the admixture rate r, for various constant Neanderthal effective population sizes ( Supplementary Fig. S8).
We can see that, predictably, P[y,r,t] is an increasing function of the admixture rate r. For low Neanderthal N e , this probability is almost exactly equal to the admixture rate: when N e is low, by the time the admixture event happens (going backwards into the past) there is only a single lineage in the Neanderthal population with high probability. The probability that this single lineage enters the Neanderthal population is Bernoulli-distributed with parameter r. For larger Neanderthal N e , there is some non-trivial probability that two or more lineages make it all the way to the time in the past when the admixture event happened. This probability is smaller the larger the N e .
! In a diploid constant randomly-mating population with equal number of males and females and non-overlapping generations, the mitochondrial N e is expected to be a quarter of the autosomal N e 34 . Assuming the Neanderthal mitochondrial N e was a quarter of the estimated autosomal N e (2,000) 35 , this probability is well approximated by the blue line in Supplementary Figure 8. We can also use the compound (generation time * N e ) estimates from the Bayesian skyline method (Supplementary Fig. S7), assuming 29 years per generation, as a proxy for the history of mitochondrial population size changes in Neanderthals. To make computations feasible, we split this history into a piece-wise constant history with 3 periods. Going from the time of sampling into the past, these were set to be: Period 1: 10,000 years with 2N e = (40,000 / 29) Period 2: 20,000 years with 2N e = (20,000 / 29) Period 3: 270,000 years with 2N e = (200,000 / 29) Under these conditions, P[y,r,t] is almost a linear function of the admixture rate, suggesting a very high admixture rate may not be required for all descendant lineages to have originated from this event ( Supplementary Fig. S9).