Ancient genomes illuminate Eastern Arabian population history and adaptation against malaria

Summary
The harsh climate of Arabia has posed challenges in generating ancient DNA from the region, hindering the direct examination of ancient genomes for understanding the demographic processes that shaped Arabian populations. In this study, we report whole-genome sequence data obtained from four Tylos-period individuals from Bahrain. Their genetic ancestry can be modeled as a mixture of sources from ancient Anatolia, Levant, and Iran/Caucasus, with variation between individuals suggesting population heterogeneity in Bahrain before the onset of Islam. We identify the G6PD Mediterranean mutation associated with malaria resistance in three out of four ancient Bahraini samples and estimate that it rose in frequency in Eastern Arabia from 5 to 6 kya onward, around the time agriculture appeared in the region. Our study characterizes the genetic composition of ancient Arabians, shedding light on the population history of Bahrain and demonstrating the feasibility of studies of ancient DNA in the region.


INTRODUCTION
Following flint and pottery evidence of human occupation from ~5000 BCE, 1 from around 2200 BCE Bahrain can be connected with references in Mesopotamian cuneiform records to the Dilmun civilization, both as a trading center 2 and as a place with an important role in Sumerian mythology, including in the Sumerian creation myth and in the Epic of Gilgamesh. 3,4 This period also saw the start of a funerary tradition that ultimately resulted in the highest density of burial mounds in the ancient world (on the order of 100,000). 5 The end of the Late Dilmun phase (~600 BCE) coincided with the Persian Achaemenid conquest of Mesopotamia, whose influence in Bahrain (600-300 BCE) is attested by the presence of bowls and other artifacts typical of that culture. 6 Around 325 BCE, an expedition to the Arabian coast sent by Alexander the Great reached the shores of Bahrain, which from that time onward would be known as Tylos. The Tylos period (~325 BCE to ~600 CE) was characterized by exceptional prosperity and marked by Hellenistic and Persian influence. 7 The death of Alexander and subsequent disintegration of the Macedonian Empire led to the establishment of the Seleucid Empire (312-63 BCE), which dominated a vast area comprising Anatolia, Levant, Mesopotamia, and Iran, also controlling the eastern Arabian islands of Failaka and Bahrain. The Seleucids eventually lost authority in Mesopotamia, leading to the formation of the semi-independent Kingdom of Characene (141 BCE to 222 CE) in Southern Iraq, a vassal to the Parthian Empire that would govern Bahrain until the Sasanian conquest. 8,9
Ancient DNA (aDNA) studies from the northern regions of the Middle East revealed that the first farmers from Anatolia, the Levant, and Iran each descended from local, genetically distinct hunter-gatherer populations, with variable amounts of gene flow from neighboring groups. 10,11 For example, early Anatolians admixed with Mesopotamians and later with Levantine agriculturalists, 12 whereas in the Levant early farmers derived from Epipaleolithic Natufians and Anatolian Neolithic populations. 11 During the transition from the Chalcolithic to the Bronze Age, admixture between various Middle Eastern groups intensified, leading to increasing genetic homogenization in the region. 11 Around this time, an ancient Iranian-related component was introduced into the Levant, 13,14 followed by steppe/European-related ancestry in the Iron Age. 15,16 In the neighboring region of Mesopotamia, the aDNA record is still sparse and therefore the genetic composition of local hunter-gatherers remains unknown, but recently published Pre-Pottery Neolithic (PPN) genomes from Upper Mesopotamia are genetically intermediate along the ancestry cline extending from ancient groups from Anatolia/Levant to Iran/Caucasus. 12,17
Due to challenges associated with the recovery of ancient genomes from hot and humid climates, 18 no ancient genomes from the Arabian Peninsula have been published so far. Surveys of present-day genomes from Arabia and the Levant suggest that they were shaped by different demographic processes. First, Arabians carry an excess of East African- and Natufian-related ancestry in comparison with Levantine populations, who in contrast bear higher proportions of European and Anatolian Neolithic ancestry. 19,20 Second, Arabian and Levantine populations present different size trajectories, with the former being affected by a pronounced bottleneck event occurring around 6,000 years ago (6 kya) that coincides with the "Dark Millennium," a period of increasing aridification in Arabia, 21 and the latter showing a more recent size reduction associated with the ~4.2 kya climate event 20 of wider distribution across Anatolia, the Levant, and Mesopotamia. 22 Third, Arabians show increased levels of consanguinity in comparison to the Levant, 23,24 potentially leading to higher prevalence of genetic disorders including G6PD deficiency. 25
To examine these topics directly using ancient genomes, we sequenced four individuals from the island of Bahrain dating from the Tylos period (~300 BCE to 600 CE). Using modeling approaches, we determine that their ancestry is a mixture of sources from ancient Levant, Iran/Caucasus, and Anatolia, with one individual presenting higher affinity to Levantine groups than the others, potentially as a result of historically recorded incursions of Arabian or Levantine tribes into Bahrain, 26 and two individuals with additional CHG (Caucasus hunter-gatherer)/post-Neolithic Iranian ancestry, which can tentatively be explained by Iranian-associated influence in Bahrain during pre-Islamic times. Additionally, we detect the G6PD Mediterranean variant in three Tylos-period Bahrainis and estimate that it rose to high frequencies in Eastern Arabia due to strong positive selection exerted by malaria endemicity coinciding with the appearance of agriculture. Lastly, we detect large runs of homozygosity (ROHs) in one individual, suggesting consanguineous unions in pre-Islamic Tylos populations. The present work provides a snapshot of the genetic composition of ancient Eastern Arabia and demonstrates the feasibility of aDNA studies in the region.

Ancient DNA sample sequencing and determination of authenticity
We extracted DNA from 25 skeletal samples from ancient burial mounds on the island of Bahrain ranging from the Dilmun period to the Tylos period and sequenced them to assess endogenous DNA preservation (Table S1). Of these, only four were sufficiently well preserved for additional sequencing, with the remaining samples presenting negligible amounts of human endogenous DNA. Here, we report shotgun sequence data obtained from these four samples, all derived from petrous bones: one from Abu Saiba (AS) and three from Madinat Hamad (MH1, MH2, MH3; Figure S1A; Tables 1 and S2). Of these samples, three were sequenced to approximately 1× coverage and one to 0.24× (Table 1). Due to poor collagen preservation, we could only obtain radiocarbon dates for two out of the four sequenced individuals, placing them in the Late Tylos/Sasanian period (LT, ~300-622 CE), with MH1 being older (432-561 cal. CE) than MH3 (577-647 cal. CE) (Figure S1B). Sample MH2 was not directly dated, but its archaeological context places it in the Late Tylos period. The Abu Saiba sample was excavated from a cemetery with known occupation between 200 BCE and 300 CE, 7,27 and therefore it dates confidently within the boundaries of the Early/Middle Tylos period (EMT), more precisely during the times of Seleucid and Characene influence in Bahrain, which preceded the emergence of the Sasanian Empire.
Deamination patterns in these sequences conform with those typical of aDNA (Figure S2), supporting the presence of authentic aDNA sequences, and we observed low contamination rates on the mitochondria of all four samples (≤2%) and on the X chromosome of the two male samples (<3%, Table 1). None of the samples were close relatives of each other.

Admixed ancestry and subtle genetic differentiation within Tylos-period individuals from Bahrain
To examine the genetic affinities of the four Tylos-period samples from Bahrain, we performed a principal component analysis (PCA) 28,29 on 1,301 present-day West Eurasians, 30,31 including 117 recently reported Arabians and Levantines, 20,32 on which we projected 529 ancient individuals (Figure 1A; full PCA shown in Figure S3A). 14,33,34 The Bahrain_Tylos individuals are slightly differentiated. Compared with the older individual AS_EMT, samples MH1_LT and MH2_LT are shifted toward eastern populations from Armenia ChL-IA composed mainly of CHG- and Anatolian-related ancestry and variable amounts of steppe ancestry, with MH2_LT being further separated from the Bahrain group toward the direction of CHG and ancient groups from Iran and Pakistan. Sample MH3_LT is closer to ancient Levantine groups than the other three individuals from Bahrain.
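The projection step described above can be illustrated with a minimal sketch. It assumes a genotype matrix coded 0/1/2 with missing calls stored as NaN, fits the principal axes on present-day individuals only, and places a low-coverage sample by least squares over the SNPs it actually covers. This is similar in spirit to the least-squares projection option offered by smartpca, but the function names (`fit_pcs`, `project`) and the simulated data are hypothetical, and the sketch is not the authors' pipeline.

```python
import numpy as np

def fit_pcs(modern, n_pcs=2):
    """Fit principal axes on present-day genotypes (rows = individuals, columns = SNPs)."""
    mean = modern.mean(axis=0)
    # SVD of the centred matrix; the rows of vt are the principal axes (SNP loadings).
    _, _, vt = np.linalg.svd(modern - mean, full_matrices=False)
    return mean, vt[:n_pcs]

def project(sample, mean, axes):
    """Least-squares projection of one sample using only its non-missing SNPs."""
    observed = ~np.isnan(sample)
    coords, *_ = np.linalg.lstsq(
        axes[:, observed].T, sample[observed] - mean[observed], rcond=None
    )
    return coords

# Simulated stand-in data: 50 present-day individuals, 200 SNPs (assumption).
rng = np.random.default_rng(0)
modern = rng.integers(0, 3, size=(50, 200)).astype(float)
mean, axes = fit_pcs(modern)

# An "ancient" sample with roughly half of its genotypes missing.
ancient = modern[0].copy()
ancient[rng.random(200) < 0.5] = np.nan
print(project(ancient, mean, axes))
```

Fitting the axes on present-day data only keeps the sparse, damaged ancient genotypes from distorting the components, at the cost of some shrinkage of projected samples toward the origin.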
To further investigate the ancestry composition of the Tylos individuals, we performed a temporally aware model-based clustering analysis on an expanded dataset using DyStruct (Figure 1B). At k = 9, Anatolian_N, Natufians, WHG (Western hunter-gatherers), and Iran_N/CHG define independent ancestral components that contribute to the majority of ancient and present-day West Eurasians (Figures 1B and S3B). Consistent with the PCA results, the four Tylos-period individuals are broadly similar to and intermediate between post-Neolithic groups from the Levant (BA/IA/Hellenistic/Roman) and Iran (ChL/IA), albeit with higher Natufian-related ancestry than ancient Iranians and higher amounts of Iran_N/CHG-related ancestry than ancient Levantines, a finding that is also corroborated by positive f4-statistics (Figures 2A and 2B, respectively).
We also observe some variability within the four Bahrain samples in the clustering analysis (Figure 1B), with MH3_LT carrying additional Anatolia_N and less Iran_N/CHG-related ancestry than the other samples. Here, Anatolian and Levantine ancestry were in some cases not well differentiated due to the presence of Anatolian ancestry in Neolithic Levantines. Through f4-statistics, we confirm that individual MH3_LT shares more alleles with both Anatolians and Levantines than the remaining Tylos-period samples and observe a gradient of Levantine-related ancestry which is maximized in MH3_LT and minimized in MH2_LT, with AS_EMT and MH1_LT presenting intermediate values between them (Table S3), as reflected in the PCA (Figure 1A). In Figure 1B, it is also apparent that MH1_LT and MH2_LT individuals present slight differences in comparison with the other two Bahraini samples, notably, higher Iran_N/CHG-related ancestry, and an additional small amount of Western/Eastern hunter-gatherer (WHG/EHG)-related ancestry (approximately 3.1%-6.2%), which forms around half of the genetic composition of steppe EMBA, 35,36 and can also be found in Armenia_ChL and certain post-Neolithic Iranians (HajjiFiruz_BA/IA).
To formally investigate these subtle differences, we tested for excess allele sharing between Bahrain_Tylos and ancient groups X and Y by estimating f4(Mbuti, Bahrain_Tylos; X, Y) (Figure 2C). We corroborate excess affinity between MH3_LT and Levantines, especially with LBN_Canaanites, in comparison with CHG or post-Neolithic Iranians, but we obtain no significant results for the remaining samples from Bahrain. Conversely, MH1_LT and MH2_LT present excess allele sharing with CHG in comparison with ISR_Natufians, suggesting lower Levantine and higher CHG ancestry in these samples, with AS_EMT presenting intermediate values. MH1_LT and MH2_LT also present increased CHG and IRN_HajjiFiruz_IA ancestry in comparison with IRN_Ganj_Dareh_N, as well as slightly higher West Siberian hunter-gatherer (WSHG) ancestry, particularly in MH2_LT.
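As a rough illustration of how a statistic of the form f4(Outgroup, Test; X, Y) and its Z score can be obtained, the sketch below averages the per-SNP product of allele-frequency differences and estimates a standard error with a delete-one-block jackknife. It assumes population allele frequencies are already available as arrays and uses equal-sized contiguous SNP blocks; dedicated software typically uses a weighted jackknife over genomic blocks, so this is a simplification. All names and the simulated frequencies are hypothetical.

```python
import numpy as np

def f4_z(pa, pb, pc, pd, n_blocks=20):
    """Return f4(A, B; C, D) and its Z score from a delete-one-block jackknife."""
    per_snp = (pa - pb) * (pc - pd)
    total, n = per_snp.sum(), per_snp.size
    estimate = total / n
    # Leave-one-block-out estimates over contiguous SNP blocks.
    blocks = np.array_split(per_snp, n_blocks)
    loo = np.array([(total - b.sum()) / (n - b.size) for b in blocks])
    se = np.sqrt((n_blocks - 1) / n_blocks * np.sum((loo - loo.mean()) ** 2))
    return estimate, estimate / se

def drifted(base, rng, extra=0.0):
    """Simulated allele frequencies: base plus optional shared drift plus private noise."""
    return np.clip(base + extra + rng.normal(0, 0.02, base.size), 0, 1)

rng = np.random.default_rng(0)
base = rng.uniform(0.1, 0.9, 5_000)
shared = rng.normal(0, 0.05, base.size)  # drift shared only by the test population and Y

outgroup = drifted(base, rng)
test_pop = drifted(base, rng, shared)
pop_x = drifted(base, rng)
pop_y = drifted(base, rng, shared)

estimate, z = f4_z(outgroup, test_pop, pop_x, pop_y)
print(f"f4 = {estimate:.5f}, Z = {z:.1f}")
```

With this ordering of populations, a significantly positive value indicates that the test population shares more alleles with Y, and a significantly negative value indicates excess sharing with X.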
In summary, the four Bahrain Tylos individuals are genetically intermediate along the ancestry cline that extends from ancient Anatolia/Levant to ancient Iran/Caucasus and in proximity to ancient groups from Mesopotamia, Armenia and Azerbaijan, with subtle differences in ancestry composition.

A tripartite genetic ancestry of Bahrain Tylos
We found that Bahrain Tylos is best modeled as a mixture of sources from ancient Anatolia, Levant, and Iran/Caucasus. First, we obtained significantly negative results (Z ≤ −3) for the statistic f3(Bahrain_Tylos; X, Y) when X was an ancient Levantine or an Anatolian population (Natufians, Levantine, or Anatolian Neolithic) and Y an ancient group from Caucasus or Iran (CHG or IRN_Ganj_Dareh_N) (Table S4), suggesting that these ancestries have plausibly contributed, even if distantly so, to the formation of Bahrain Tylos.
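The logic of the admixture f3 test can be shown with a small simulation. The sketch assumes true population allele frequencies, so it omits the finite-sample heterozygosity correction that real implementations apply to the target population, and all population names and parameters are hypothetical. A target formed by mixing two sources yields a negative f3, whereas a non-admixed source used as the target does not.

```python
import numpy as np

def f3(target, source1, source2):
    """f3(target; source1, source2) averaged over SNPs, from allele frequencies."""
    return float(np.mean((target - source1) * (target - source2)))

rng = np.random.default_rng(1)
n_snps = 20_000
ancestral = rng.uniform(0.2, 0.8, n_snps)

# Two source populations that drifted independently from a shared ancestor.
west = np.clip(ancestral + rng.normal(0, 0.1, n_snps), 0, 1)
east = np.clip(ancestral + rng.normal(0, 0.1, n_snps), 0, 1)

# A target formed by a 50/50 mixture of the sources plus a little drift of its own.
admixed = np.clip(0.5 * west + 0.5 * east + rng.normal(0, 0.02, n_snps), 0, 1)

f3_admixed = f3(admixed, west, east)  # negative: target is intermediate between sources
f3_control = f3(west, admixed, east)  # positive: a source is not a mixture of the others
print(f3_admixed, f3_control)
```

A negative value is evidence of admixture, but a non-negative value does not rule it out, because drift in the target after admixture pushes the statistic upward.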
Second, nearly all the two-way qpAdm models that fit the data involve populations carrying three types of ancestry: Levantine, Anatolian, and CHG/Iranian Neolithic (Table S5). For instance, all four samples can be modeled with mixtures of Levant_PPN and CHG (p > 0.05; Figure 3A and Table S5). However, when replacing Levant_PPN with ISR_Natufian_EpiP, only two samples (AS_EMT and MH3_LT) were successfully modeled, although with lower p values, suggesting that an additional Anatolian-related component, which is present in Neolithic Levant, but absent in Natufians and CHG, is required to model the remaining individuals. We note that Anatolian Neolithic ancestry can also derive from other groups, such as ARM_Aknashen_N (formed of Anatolian and CHG ancestry), which returns feasible models when combined with ISR_Natufian_EpiP or Levant_PPN (p > 0.05; Table S5).
Models including Levant_PPN and IRN_GanjDareh_N sources are not as successful, given that only the lower-coverage MH3_LT individual can be modeled in this way (p = 0.042; Figure 3A and Table S6), suggesting that CHGs might be better representatives than Neolithic Iranians for the eastern ancestry present in the Tylos samples. Accordingly, we obtain a significantly negative (Z = −3.81) f4-statistic f4(Mbuti.DG, Bahrain_Tylos; CHG, IRN_Ganj_Dareh_N), suggesting excess affinity between Bahrain_Tylos and CHG in relation to IRN_Ganj_Dareh_N. This is despite the fact that, as seen in our DyStruct analysis (k = 12, Figure S3B), CHG and IRN_Ganj_Dareh_N ancestries are difficult to distinguish from one another. It is possible that they both contributed to the formation of Tylos-period Bahrainis. We also obtained generally less confident but feasible models when combining Iranian Neolithic or Late Neolithic with CYP_PPNB or TUR_C_Çatalhöyük_N, with the latter two populations containing both Anatolian and Levantine ancestry (Figure 3A and Table S5).
Third, we also find support for a tripartite genetic ancestry of Bahrain Tylos when including more temporally proximal groups as sources in our models (Figure 3B; Tables S5 and S6), obtaining feasible combinations of older Levantine and more recent Iranians and vice versa. The overall pattern in these models is that the most recent source contributes with a greater ancestry proportion to Bahrain Tylos samples, irrespective of being Iranian or Levantine. We interpret this as being due to the fact that later samples tend to carry the Anatolia/Levant/Iran/CHG-related ancestries required for modeling the four samples from Bahrain.
Lastly, we evaluate the ancestry of Bahrain Tylos individuals in a context of Near Eastern variation by estimating ancestry proportions using a previously published 12 three-way model with TUR_Pınarbaşı_EpiP, ISR_Natufian_EpiP, and CHG as sources in a set of relevant ancient groups which can also be modeled in this way (Figure 3C). 12 In this model, the Bahrain_Tylos samples present ancestry proportions similar to those of various ancient samples from Mesopotamia, the Caucasus, and post-Neolithic Iranians and Levantines. Accordingly, rank-0 qpAdm models show that Tylos-period Bahrainis form a clade with several of these ancient groups (p ≥ 0.01; Table S7), suggesting that similar sources have contributed to their ancestry. According to our analyses, Tylos-period samples from Bahrain can be modeled using sources related to ancient groups from Anatolia, Levant, and Iran or Caucasus and are broadly similar in ancestry to various groups from neighboring regions.

Recent gene-flow events shaped the genetic composition of Tylos-period Bahrainis
We next investigated the timing of admixture between ancient Levantine and Iranian sources in Bahrain Tylos samples using DATES (Figure 3D). We observe different admixture times for AS_EMT and MH1-2_LT: the first occurred 14 ± 1.7 generations before AS_EMT (357-455 years, assuming a generation time of 29 years), coinciding with a period of Achaemenid influence in Bahrain, and the second occurred 16 ± 7 generations (261-667 years) before MH1_LT and MH2_LT, a period defined by the Characene rule of Bahrain.
While recognizing that there are limits to fitting complex models of population history, 37 these processes are illustrated well by an admixture graph that was semi-automatically generated to fit the data (Figure 3E). In this graph the earliest sample from Abu Saiba (AS_EMT) derives from admixture between sources related to the Levant and Iran_N/CHG. The three Late Tylos individuals descend from admixture between the earliest sample AS_EMT and other sources, with MH1_LT and MH2_LT having an additional pulse of post-Neolithic Iranian-related ancestry, here represented by IRN_Hajji_Firuz_IA, whereas the MH3_LT sample received additional ancestry associated with the Levant, represented by LBN_Canaanite from Sidon. These inferences are supported by the observation of additional IRN_IA ancestry in MH1_LT and MH2_LT relative to AS_EMT and MH3_LT, and that MH3_LT bears more Levantine ancestry than the other samples from Bahrain (Figure 3B). We show an alternative model where the post-Neolithic Iranian source is instead represented by IRN_Shahr-i-Sokhta in Figure S4. We note that the populations depicted in the graph act as representatives of different ancestries and do not necessarily correspond to the actual populations that contributed to the formation of Tylos-period Bahrainis. Further, the true population history is likely to be different in detail from, and more complex than, the admixture processes represented here.
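The year ranges quoted for the admixture dates follow from multiplying the estimate plus or minus one standard error, in generations, by the assumed generation time of 29 years. A short sketch of that arithmetic (the function name is hypothetical):

```python
def generations_to_years(mean_generations, se_generations, years_per_generation=29):
    """Convert an admixture date of mean ± SE generations into a (low, high) range in years."""
    low = (mean_generations - se_generations) * years_per_generation
    high = (mean_generations + se_generations) * years_per_generation
    return round(low), round(high)

print(generations_to_years(14, 1.7))  # (357, 455): years before AS_EMT
print(generations_to_years(16, 7))    # (261, 667): years before MH1_LT and MH2_LT
```

These intervals are measured backward from the date of each sample, so converting them to calendar dates also inherits the uncertainty in the sample dates themselves.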

Affinities with Mesopotamia
Considering the abundance of evidence of contacts between ancient Bahrain and Mesopotamia, we evaluated separate models using Mesopotamia_PPN (two PPN samples from Northern Iraq and one from Mardin in Turkey), an LBA sample from Iraq, and the Upper Mesopotamian Cayonu_PPN from Southeastern Turkey as a source. First, we observe that all four Tylos samples form a clade with IRQ_Nemrik9_LBA, two samples form a clade with Mesopotamia_PPN (MH1_LT and MH3_LT), and only the lower-coverage MH3_LT forms a clade with Cayonu_PPN (Table S7). When exploring two-source models with Mesopotamia_PPN, we can model all samples with Levant_PPN as a second source, except for sample MH2_LT, which requires additional EHG ancestry, or using RUS_AfontovaGora3, SRB_Iron_Gates_HG, or LBN_IA as a second source instead of Levant_PPN (Table S8). While the inability to reject various models of Bahrain Tylos ancestry containing ancient Mesopotamians as a source suggests a relationship between the two, the low Z scores of admixture weights attributed to non-Mesopotamian sources do not allow establishing the precise role of Mesopotamian groups in the formation of Bahrain Tylos.

Tylos-period Bahrainis are genetically closer to present-day populations from Iraq and the Levant than to present-day Arabians
Regarding affinities with present-day populations, the temporally aware model-based clustering analysis (Figure 1B) suggests that Tylos-period Bahrain samples are more similar to present-day Levantine groups than to present-day Arabians or South Asians, who show higher and lower amounts of Natufian ancestry, respectively. To investigate this in more detail, we performed a haplotype-based ChromoPainter and FineSTRUCTURE analysis with a dataset including Tylos-period Bahrainis and ancient samples from the Levant 13,38 and Iran 10 with available whole-genome shotgun sequence data, as well as present-day individuals from the Human Origins dataset 11,39 and from Arabia and the Levant 20 (Figure 4).
Three Bahrain Tylos samples (AS_EMT, MH1_LT, and MH2_LT) were included in a cluster composed of Syrians, Iraqis, and two Jewish individuals from Iran and Georgia (Figure 4A), whereas sample MH3_LT clustered with 22 ancient individuals from Lebanon ranging from the Bronze Age to the Roman period. In a PCA of the haplotype sharing matrix (Figure 4B), we observe that MH3_LT is closer to ancient Lebanese and Sardinians, corroborating our previous inferences of increased ancient Levantine and Anatolian Neolithic ancestry in this sample. The remaining three Bahrain Tylos individuals are positioned closer to present-day groups from the Levant, Iraq, and the Caucasus, including several Jewish populations from these regions, and between ancient Lebanese and an Iranian IA individual.
To gain further insights into the relationship of Tylos to present-day populations from the Arabian Peninsula and the Levant, we tested whether the Bahraini samples form a clade with any modern population in our dataset using a set of reference populations (as outgroups) that can differentiate the different ancestries in the Near East. We found that Iraqis, Assyrians, and Jewish groups from Iran, Georgia, and Iraq could derive all their ancestries from Tylos-period Bahrainis (Table S10). Arabians such as Saudis, Emiratis, and Yemenis have, in addition to ancestry from Tylos-period Bahrain, ancestry from East Africa, while Levantines such as Druze and Lebanese have additional Southeast European ancestry (Table S11). Here we should note that modeling present-day populations with Tylos-period Bahrain does not imply a direct contribution but rather that they are suitable representatives of the ancestry found in present-day populations.

Diverse uniparental lineages link ancient Bahrainis to Middle Eastern populations
We used pathPhynder 40 to examine the Y chromosome lineages of the two male individuals from Late Tylos-period Bahrain in a context of present-day and ancient samples (Figure S5 and Table S12). Individual MH3_LT carried the H2 haplogroup, which is associated with the spread of Near Eastern and Anatolian farmers into Europe. 41 MH3_LT was placed in the same clade as a PPNB Jordanian and two Anatolian samples (Alalakh_MLBA and Ilipinar_ChL), which is consistent with the genetic affinities of this individual (Figure S5A).
Sample MH1_LT, the only other male sample in our dataset, presented the J2a2a1a~ lineage, which in a phylogeny with 2,014 individuals is carried by a present-day Brahmin and a Uighur individual. In the aDNA record, this lineage and derived haplotypes have been identified in two Turkish individuals (Gordion_Anc and Medieval), in a Canaanite, and in an Iron Age Hasanlu individual, 12,42 and in various present-day Central Asian samples from Turkmenistan and Kazakhstan. An eastern origin (Iran-Caucasus region) for haplogroup J2 is likely, given that the earliest occurrence of this lineage in the aDNA record is in hunter-gatherers from Caucasus and Iran, 11,43 with the latter being placed at the base of the J2a clade (Figure S5B). MH1_LT's inclusion in this Y chromosome clade is consistent with the subtle excess in shared autosomal ancestry with CHG/IRN_HajjiFiruz_IA individuals in this sample (Figure 2C).
Strikingly, neither of these lineages was found in present-day Arabians. 20 Their presence in both western (Anatolia/Levant) and eastern (Iran/Caucasus/Central Asia) ancient samples reflects connectivity between these ancient civilizations as early as the PPN (as attested by the presence of eastern ancestry and J2 Y chromosome lineage in Cayonu_PPN 44 ) which apparently intensified during post-Neolithic times. 42
The mitochondrial lineages of the Tylos-period samples suggest maternal ancestry sharing with various groups from the Near East, Caucasus, and South Asia. The earliest sample AS_EMT carries the mtDNA J1c15a1 haplogroup previously reported in two present-day samples from Iraq 45 and Azerbaijan, 46 groups which typically carry Iranian- and CHG-related ancestry. The Late Tylos samples from Madinat Hamad presented distinct lineages, with MH1_LT belonging to the R2 haplogroup, which is predominantly South Asian (southern Pakistan and India), but also distributed in the Middle East, Caucasus, and Central Asia. 47 In the aDNA record this lineage is most frequent in Iranian groups, including three Neolithic and two BIA/LBA individuals, 42,48 which is consistent with the additional post-Neolithic Iranian-related ancestry in this sample. Interestingly, its presence in a Bronze Age Canaanite, coinciding with the emergence of Iranian ancestry in the Levant, provides additional support for the association of R2 lineages with this source of ancestry. Sample MH2_LT carries the T2b lineage, of widespread distribution in European samples. Lastly, MH3_LT presents the U8b1a2a haplogroup, also found in two ancient LBA Armenians, two Turkish individuals (one ChL and the other dated to 750-480 BCE), 42 and a present-day Jordanian individual, 49 suggesting distribution in the Caucasus, Levant, and Anatolia and consistency with the increased Levantine and Anatolian affinities of this sample. These observations support extensive female as well as male migration in ancient West Asia, including Eastern Arabia.

Phenotypic prediction
We used HIrisPlex-S 50 to predict hair, skin, and eye color phenotypes of the Bahrain_Tylos samples. All four samples were predicted to have brown eyes (>99% probability). For hair color, the prediction was either brown (~50%) or black (~50%). Two samples (MH2_LT and MH3_LT) were predicted to have "dark" skin pigmentation (>90%), whereas the results for the remaining two samples were less certain, with AS_EMT potentially having relatively lighter skin pigmentation (Table S13). The predicted phenotypes found in ancient Bahrain are similar to those in present-day Middle Easterners and South Asians. Within these regions, subtle geographical trends occur at the level of skin pigmentation. Specifically, non-Bedouin Levantine groups from the Human Genome Diversity Project (HGDP) tend to have higher proportions of "intermediate" skin pigmentation, while Pakistanis tend to have higher proportions of "dark" skin than most Levantine populations, especially in the south. 50

Runs of homozygosity suggest a spectrum of inbreeding similar to present-day Middle Easterners
We also examined ROHs in the Tylos samples and compared them with present-day and ancient genomes (Figure 5A). Contemporary Middle Eastern populations show long tracts of ROHs indicative of recent consanguinity, while ancient regional hunter-gatherer groups are known to have relatively large numbers of shorter ROHs, reflecting their smaller population size. 51 We observed an excess of ROH size in sample MH2_LT which is similar to that seen in present-day Middle Eastern and South/Central Asian populations (Figure 5A). In comparison to ancient groups, we find that three individuals (AS_EMT, MH1_LT, and MH3_LT) show an ROH distribution similar to those of Bronze and Iron Age regional populations (Figure S6A). However, one sample (MH2_LT) appears to have larger ROHs, with one longer than 16 Mb, suggesting direct parental relatedness (Figure S6B).
To confirm this result, we then used an alternative method 52 for identifying ROHs in ancient samples. We observe that MH2_LT has levels of inbreeding consistent with being the offspring of related individuals (potentially second cousins; Figures 5B and S6C), including a stretch on chr7 that spans 20.47 cM (Figure S6D). This finding is evidence that consanguineous union was likely to have already been practiced in pre-Islamic Arabian societies.
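To make the notion of a run of homozygosity concrete, the following deliberately naive sketch scans an ordered list of sites along one chromosome and reports maximal stretches of consecutive homozygous genotypes that span at least a minimum length. It assumes perfectly called diploid genotypes and tolerates no errors, which is unrealistic for low-coverage aDNA; the methods cited in the text handle genotype uncertainty and are not reproduced here. The function name and the toy data are hypothetical.

```python
def find_roh(positions, is_het, min_length):
    """Return (start, end) intervals of consecutive homozygous sites spanning >= min_length."""
    runs = []
    start = prev = None
    for pos, het in zip(positions, is_het):
        if het:
            # A heterozygous site closes the current run, if there is one.
            if start is not None and prev - start >= min_length:
                runs.append((start, prev))
            start = None
        else:
            if start is None:
                start = pos
            prev = pos
    # Close a run that extends to the last site.
    if start is not None and prev - start >= min_length:
        runs.append((start, prev))
    return runs

# Toy example: positions in Mb along one chromosome, True marks a heterozygous site.
positions = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
is_het = [False, False, False, True, False, False, False, False, False, True]
print(find_roh(positions, is_het, min_length=3))  # [(4, 8)]
```

Long runs are informative about recent parental relatedness because recombination has had few generations to break up the shared ancestral segment, whereas many short runs point to small population size in the more distant past.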

High prevalence of the malaria-protective G6PD Mediterranean mutation in ancient Eastern Arabia coincided with the introduction of agriculture
G6PD deficiency is the most common enzymatic defect in humans, 53 and its distribution in worldwide populations correlates with regions currently or historically affected by malaria, including Africa, the Mediterranean, the Middle East, and Southeast Asia, 54 leading to the proposal of a link between G6PD deficiency and malaria protection. 55 Considering that malaria was endemic in Bahrain during historical times, 56 with osteological analyses suggesting that it was present in the island at least by the Tylos period, 57 we examined the Tylos-period Bahraini samples for the presence of mutations putatively associated with malaria protection. Sequencing reads from two ancient samples from Bahrain carried the G6PD Mediterranean mutation (rs5030868; G>A; p.S188F). MH2_LT is potentially heterozygous, given that we observed two reads overlapping this SNP in this sample, one supporting the derived allele A and the other with the ancestral allele G, whereas AS_EMT presented two reads with the derived allele.
To corroborate our findings and obtain additional insights about the distribution of this variant in ancient populations, we imputed the X chromosome of the Bahrain_Tylos samples together with 37 ancient Lebanese and four ancient Iranians. We confirm the presence of derived alleles (genotype probability >0.95) at the rs5030868 locus in samples AS_EMT (homozygous), MH2_LT (heterozygous), and MH3_LT (male; hemizygous mutant). None of the ancient Lebanese we imputed carried this mutation and neither did five individuals from Bronze Age Greece with sequencing reads spanning this site, 58 but we observed a heterozygous genotype in a Western Iranian Neolithic sample (AH1, genotype probability >0.9). When estimating a phylogeny with the imputed X chromosome haplotypes containing the mutation of interest, we observe the inclusion of three Tylos-period Bahrainis in a clade with additional samples carrying the derived allele, including present-day Emiratis, Omanis, and Yemenis, as well as the Iranian Neolithic sample AH1 (Figure 6A).
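The limited information carried by two reads at a single site can be illustrated with a simple binomial genotype-likelihood sketch. It assumes a diploid site (as on the X chromosome of a female), independent reads, and a single symmetric per-read error rate of 1%, which is an arbitrary illustrative value. Post-mortem deamination can itself produce apparent G>A changes, so a realistic error rate for this site in aDNA may be higher. The function name and genotype labels are hypothetical.

```python
from math import comb

def genotype_likelihoods(n_ancestral, n_derived, error=0.01):
    """Binomial likelihood of the observed read counts under each diploid genotype."""
    n = n_ancestral + n_derived
    # Probability that a single read shows the derived allele under each genotype.
    p_derived = {"anc/anc": error, "anc/der": 0.5, "der/der": 1 - error}
    return {
        genotype: comb(n, n_derived) * p**n_derived * (1 - p) ** n_ancestral
        for genotype, p in p_derived.items()
    }

one_each = genotype_likelihoods(n_ancestral=1, n_derived=1)   # pattern reported for MH2_LT
two_derived = genotype_likelihoods(n_ancestral=0, n_derived=2)  # two derived reads

print(one_each)
print(two_derived)
```

Under these assumptions, one read of each allele clearly favors the heterozygous genotype, whereas two derived reads favor the homozygous derived genotype but leave a heterozygote about a quarter as likely, which shows why additional evidence such as imputation is useful at this depth.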
In present-day populations, we observe relatively high frequencies of this variant in Makrani (0.19), Brahui (0.14), and Pathan (0.08) population samples, all of which bear high levels of Iranian Neolithic-related ancestry. In Europe frequencies are substantially lower, with a single Sardinian sample carrying the derived allele in the HGDP dataset (0.03), and 3 out of 53,000 European individuals in the gnomAD database. 59 In the Middle East, it can be found in Palestinians (0.06) but not in Druze, Bedouins, Jordanians, or Iraqis. 20 In Arabian populations, the derived allele reaches its highest frequencies in EmiratiC (0.38), a subgroup of the Emirati who carry substantial Iranian/South Asian-related ancestry (Figure 1B), and 6% in the rest of the Emirati population. It has also been found in the present-day populations of Yemen (0.04) and Qatar (0.03), 32 but not in Saudis. 20 In the Levant and Eastern Arabia, the Mediterranean mutation is responsible for the majority of G6PD-deficient cases, reaching very high conditional frequencies in Bahrainis (0.91), 60 Northern Iraqis (0.88), Kuwaitis (0.74), and Jordanians (0.54). 61
To search for signatures of selection associated with the G6PD Mediterranean, we performed an extended haplotype homozygosity (EHH) analysis on the haplotypes surrounding this SNP in a dataset of present-day Arabian and Levantine groups. We observe that the haplotypes carrying the derived allele tend to be longer than those with the ancestral allele (Figure 6B), consistent with a signature of positive selection. We subsequently modeled the historical change in frequency of the variant and found that it has increased rapidly in the past 6,000 years in present-day Emiratis (EmiratiC) with an estimated selection coefficient of 0.013, suggesting strong selective pressure (Figure 6C). This date broadly coincides with the start of the Bronze Age in Eastern Arabia (~3200 BCE), a period of cultural transformation marked by the shift from nomadic pastoralism to agriculture. 62
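The EHH comparison described above can be sketched as follows. EHH at a given marker is taken here as the probability that two haplotypes drawn at random from those carrying the same core allele are identical across the whole interval between the core SNP and that marker. The sketch assumes phased haplotypes coded as lists of 0/1 alleles with the core SNP at a known index; the toy haplotypes and the function name are hypothetical and do not come from the study's data.

```python
from collections import Counter
from math import comb

def ehh(haplotypes, core_index, marker_index):
    """EHH between the core SNP and a marker for haplotypes sharing one core allele."""
    lo, hi = sorted((core_index, marker_index))
    n = len(haplotypes)
    # Group haplotypes that are identical over the whole core-to-marker interval.
    groups = Counter(tuple(h[lo:hi + 1]) for h in haplotypes)
    return sum(comb(count, 2) for count in groups.values()) / comb(n, 2)

# Toy phased haplotypes; the core SNP is at index 0 (1 = derived, 0 = ancestral).
derived = [[1, 0, 0, 1], [1, 0, 0, 1], [1, 0, 0, 1], [1, 0, 1, 1]]
ancestral = [[0, 0, 0, 1], [0, 1, 0, 0], [0, 0, 1, 1], [0, 1, 1, 0]]

for marker in range(4):
    print(marker, ehh(derived, 0, marker), ehh(ancestral, 0, marker))
```

In this toy example the derived-allele haplotypes retain an EHH of 0.5 at the most distant marker while the ancestral-allele haplotypes have decayed to 0, the kind of asymmetry that is interpreted as a signature of a recent selective sweep.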

DISCUSSION
Population heterogeneity and admixture in Tylos-period Bahrain
In the present work, we address an important gap in the aDNA record by presenting whole-genome sequences from Eastern Arabia, more precisely from Tylos-period Bahrain, an era of Hellenistic, Parthian, and Sasanian influence in the region. We observe that the four individuals from the Tylos period occupy an intermediate position along the genetic ancestry cline spanning from western (Anatolian/Levantine) to eastern (Iran/CHG) sources, with some degree of heterogeneity in ancestry composition within our samples.
We detected an excess of Levantine-related ancestry in sample MH3_LT, which raises the possibility of admixture with populations from the Levant or as yet unsampled Arabian groups that may have carried this ancestry into Bahrain before or around the onset of Islam. The first hypothesis is supported by the substantial affinity of MH3_LT to Lebanese Canaanites and other Levantine groups and by shared Y chromosome and mitochondrial DNA lineages between this individual and ancient eastern Anatolians and Levantines. The second hypothesis is supported by the existence of historical records attesting to the presence of several nomadic Arab tribes, such as the Abd Al-Qays and Al-Adz, under Nasrid control 63 in Bahrain during this time. These tribes migrated from Southwestern Arabia into Bahrain and neighboring regions in pre-Islamic times, subsequently participating in the conquests of Persia and Mesopotamia in the second half of the 7th century CE. 26 Such migrations could have brought additional Levantine ancestry to Bahrain, given that Southwestern Arabians, such as the Yemeni from Maarib, carry substantial Natufian-related ancestry but virtually no African ancestry, 19 as is the case of sample MH3_LT, and are, therefore, potentially good representatives of Arabian ancestry prior to admixture with other sources including Africans.
The presence of minor steppe-related ancestry in the Late Tylos samples (MH1_LT and MH2_LT), but not in the earliest sample from Abu Saiba (AS_EMT), does not require a direct influx from steppe groups, associated, for example, with the LBA migrations that spread Indo-European languages into South Asia. 33,64 Instead, this ancestry may have been introduced to Bahrain through admixture with geographically and temporally more proximal groups from ancient Levant or Iran, where it appears around the Iron Age 15,38 and Bronze Age, 33 respectively. Considering that Bahrain's history is marked by Achaemenid, Parthian, and Sasanian influence, it is plausible that groups associated with these empires acted as vectors for introducing steppe ancestry into Bahrain. In support of this hypothesis, the J2a2a1 Y chromosome lineage of sample MH1_LT is also present in the Iranian IA and in other ancient groups with substantial Iranian-related ancestry (Central Asian, Anatolian, Levantine BA), but neither of the two males reported here carried the R1a-Z93 lineage associated with the spread of steppe ancestry and Indo-European languages into South Asia. 33
We inferred admixture times between Levantine and Iranian sources occurring first, during the Achaemenid Empire, and second, during a time of Characene rule of Bahrain, suggesting that political and cultural changes were accompanied by gene flow. However, this analysis suffers from several limitations. First, it is possible that this finding may derive from a lack of power to determine exact admixture times in the case of continuous gene flow. Second, the lack of direct radiocarbon dates for two of our samples, AS_EMT and MH2_LT, limits precise estimation of admixture timing.
Given the limited sample size (n = 4) and the narrow temporal focus on a specific period in Bahrain's history, it is important to emphasize that it remains challenging to establish whether these ancestry differences derive from specific population movements occurring alongside political changes in the Near East after the Dilmun period or if they reflect pre-existing genetic diversity within the island's population. As seen in our PCA (Figure 1A), Near Eastern groups such as Iron Age Levantines and Iranians present relatively high diversity and, therefore, it is feasible that the differences we observe between our samples derive from pre-existing genetic structure within the Bahraini population not directly related to the events mentioned here. Furthermore, while we were able to identify several plausible models for describing the ancestry of Tylos-period Bahrainis, the current lack of ancient genomes from the Arabian Peninsula and sparse sampling from Mesopotamia (particularly from the south, from where no aDNA samples have been published so far) prevents us from testing models with more proximal sources relevant to Bahrain. Nevertheless, various analyses point to some degree of genetic similarity between a Late Bronze Age sample (IRQ_Nemrik9_LBA) and the Bahraini samples, which gathers support from archaeological and textual evidence of contacts between Mesopotamia and Bahrain at the time of the Dilmun civilization. However, as this observation derives from a single Mesopotamian sample, this connection should be re-examined when additional ancient genomes become available. A particularly important question remains regarding the ancestry of the Dilmun-period inhabitants, which, once characterized, will help describe more precisely how the genetic composition of Arabians has evolved through time, the relationship of Dilmun with neighboring civilizations, and their contribution to later Tylos-period groups.

Increased malaria endemicity in Eastern Arabia
We report the presence of the G6PD Mediterranean mutation in three out of four samples and four out of six alleles from ancient Bahrain. Notably, this includes a homozygous female and a hemizygous male, who would have suffered G6PD deficiency and, possibly, hemolytic anemia. This finding suggests that this mutation occurred at an appreciable frequency in Eastern Arabia during the Tylos period. The identification of this variant in an Iranian Neolithic sample, but not in ancient Levant or ancient Europe, together with its prevalence in present-day groups from Pakistan and Arabian groups with high Iran N-related ancestry (EmiratiC), challenges the hypothesis that it was introduced into the Middle East in relatively recent times alongside the Greek expansions of the first millennium BCE 65 or that its dispersal into Europe occurred alongside Neolithic migrations. 66 Our data are consistent with an alternative history in which this mutation was disseminated into Arabia through admixture with Mesopotamian/Iranian/South Asian groups, rising to high frequency due to selective pressures, particularly along the Arabian coast, where malaria incidence was especially high. According to our estimates, the onset of selection for this variant in Eastern Arabia occurred between 5 and 6 kya, broadly coinciding with the emergence of agriculture in the region. Such changes are evidenced by the presence of domesticated cereals at Al-Hili dated to around 3000 BCE 62 and the development of oasis agriculture associated with the Umm an-Nar culture (2700-2000 BCE). 67 Increased sedentarism in areas with available water 68 would have created a propitious environment for the proliferation of malaria, causing a rise in frequency of malaria-protective variants.
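As a rough illustration of what a selection coefficient of 0.013 acting over roughly 6,000 years implies, the sketch below uses a simple deterministic model of genic selection, in which the odds of the derived allele are multiplied by (1 + s) each generation. This is a textbook simplification and not the inference method used in this study: it ignores genetic drift, dominance, X-linkage, and changes in selection strength over time. The generation time of 29 years and all function names are assumptions made for illustration.

```python
def generations_elapsed(years, years_per_generation=29.0):
    """Convert calendar years to generations (the generation time is an assumption)."""
    return years / years_per_generation


def frequency_after(p0, s, generations):
    """Allele frequency after `generations` of genic selection with coefficient s.

    Under p' = p(1 + s) / (1 + p*s), the odds p / (1 - p) grow by (1 + s)
    every generation.
    """
    odds = p0 / (1.0 - p0) * (1.0 + s) ** generations
    return odds / (1.0 + odds)


def starting_frequency(p_now, s, generations):
    """Invert frequency_after: the frequency needed `generations` ago to reach p_now."""
    odds = p_now / (1.0 - p_now) / (1.0 + s) ** generations
    return odds / (1.0 + odds)


# Values taken from the text: s = 0.013, about 6,000 years,
# present-day EmiratiC frequency of 0.38.
g = generations_elapsed(6000)
implied_start = starting_frequency(0.38, 0.013, g)
```

Under these assumptions, `implied_start` is the frequency the variant would have needed at the onset of selection to reach the present-day EmiratiC frequency; a different generation time or a model that includes drift would give different values.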
Our confidence in imputing the Mediterranean variant in the three samples that carried the derived allele is supported by direct observation of the variant in a total of three reads spanning that position in two individuals. Although this SNP is a G/A change, potentially causing difficulties in distinguishing it from postmortem deamination, this is likely to be more problematic for the Iranian Neolithic sample than for our uracil-DNA-glycosylase-treated samples. Nevertheless, this Iranian individual shares the same haplotype as the other derived samples, providing additional confidence in it being a carrier. Expanding our investigation of the Mediterranean variant to a wider set of ancient groups would be desirable but currently challenging because this SNP is not included as part of the 1,240k SNP target capture array commonly used for sequencing ancient samples. 69 Future studies generating additional whole ancient genomes from the Mediterranean region, the Middle East, and South Asia should provide more insights about the spread and distribution of malaria-protective variants in the ancient world.
Lastly, we demonstrate the feasibility of aDNA studies in Arabia, paving the way for future research aimed at elucidating human population movements in the region. The sequence data and insights reported here will be of long-term importance for studying human population history in the Middle East and beyond.

Limitations of the study
One of the main limitations of our study is the small sample size, comprising only four individuals. Due to the hot climate of Eastern Arabia, DNA preservation in skeletal material tends to be poor, preventing us from obtaining a larger number of samples with sufficient endogenous content for analysis. Furthermore, the four individuals here reported belong exclusively to the Tylos period, creating challenges in determining exactly how genetic composition changed in Bahrain from the Dilmun period to the present day. Due to poor collagen preservation, we were also unable to obtain radiocarbon dates for two samples, introducing uncertainties at the level of temporal placement and estimation of admixture times. Additionally, we currently lack aDNA from mainland Arabia and Southern Mesopotamia, which could bring additional clarity to our current ancestry models. Future studies, possibly equipped with novel methodological developments, should be directed toward retrieving a larger number of samples from Arabia and Mesopotamia and covering a larger temporal range to provide further insights about the human population history of the region.

STAR★METHODS
Detailed methods are provided in the online version of this paper and include the following:

Sex determination and relatedness analysis
We determined the sex of the four ancient samples from Bahrain using a previously published method 74 (Table 1). We performed a kinship analysis on the four ancient samples from Bahrain using READ (https://bitbucket.org/tguenther/read/) 75 with default parameters on approximately 1,240k SNPs of the Allen Ancient DNA Resource, genotyped as described below, which did not reveal any close kinship relationships between them.
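The idea behind this kind of kinship screen can be sketched as follows. This is a simplified illustration and not READ itself: READ works on pseudo-haploid calls in windows along the genome and applies its own classification scheme, whereas the sketch computes one genome-wide mismatch rate per pair and divides it by the median across all pairs. The function names, the use of `None` for missing calls, and the toy genotypes are assumptions made for illustration.

```python
from itertools import combinations
from statistics import median


def mismatch_rate(a, b):
    """Proportion of differing alleles over sites called in both samples.

    Each sample is a list of pseudo-haploid calls (0 or 1), with None for
    sites that have no call.
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        raise ValueError("no overlapping sites")
    return sum(x != y for x, y in shared) / len(shared)


def normalized_mismatch(samples):
    """Map each sample pair to its mismatch rate divided by the median over all pairs."""
    raw = {
        (i, j): mismatch_rate(samples[i], samples[j])
        for i, j in combinations(sorted(samples), 2)
    }
    med = median(raw.values())
    if med == 0:
        raise ValueError("median mismatch rate is zero; cannot normalize")
    return {pair: rate / med for pair, rate in raw.items()}


# Toy pseudo-haploid calls (hypothetical sample names and genotypes).
toy_samples = {
    "S1": [0, 1, 0, 1, None, 1, 0, 0],
    "S2": [0, 1, 0, 1, 1, 1, 0, 0],
    "S3": [1, 0, 0, 1, 1, 0, 1, 0],
    "S4": [0, 0, 1, 1, 0, 0, 0, 1],
}
normalized = normalized_mismatch(toy_samples)
```

Pairs with a normalized value close to 1 behave like a typical unrelated pair in the dataset, while markedly lower values point to closer relatedness. With only four individuals, the median is taken over just six pairs, so the normalization is sensitive to any related pair in the set.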
intermediately along the cline connecting western (Anatolian and Levantine) and eastern (CHG and IRN_Ganj_Dareh_N) sources and in the vicinity of Upper Mesopotamian PPN individuals from Southeastern Turkey and Northern Iraq (Mesopotamia_PPN) and Neolithic (N) farmers from Armenia (ARM_Masis_Blur_N and ARM_Aknashen_N) and Azerbaijan (AZE_N). Also in the proximity of the ancient Bahrain samples are various post-Neolithic groups (Chalcolithic [ChL], Iron Age [IA], Bronze Iron Age [BIA]) from Iran (IRN_HajjiFiruz_ChL/IRN_Hasanlu_IA, IRN_DinkhaTepe_A_BIA), Bronze Age (Early [E], Middle [M], Late [L] Bronze Age [BA]

Figure 3. Models of Bahrain Tylos ancestry (A and B) (A) Selected rank-1 qpAdm models of the four Bahrain Tylos individuals including distal sources and (B) a mixture of distal and proximal sources. Feasible rank-0 models are shown instead in the cases where the rank-1 model was rejected. Single asterisks (*) indicate 0.05 > p ≥ 0.01; double asterisks (**) and white bars indicate rejected models at p < 0.01. (C) qpAdm models of ancient Near Easterners using Natufians, Pinarbasi, and CHG as sources. Horizontal bars show standard errors. (D) Inferred time of admixture in Bahrain Tylos using DATES ancestry covariance decay curve with LBN_Canaanites and IRN_HajjiFiruz_IA as references representing ancient Levantine and ancient Iranian ancestries, respectively. Vertical bars show standard errors. (E) Modeling Bahrain Tylos ancestry using qpGraph. See also Figure S4 and Tables S4, S5, S6, S7, S8, and S9.

Figure 4. Haplotype-based population affinities (A) FineSTRUCTURE phylogeny and clustering based on haplotype sharing patterns from present-day Eurasians and ancient Middle Easterners. (B) PC analysis estimated from the ChromoPainter haplotype sharing matrix. See also Figure S5 and Tables S10, S11, S12, and S13.

Figure 5. Runs of homozygosity (A) Runs of homozygosity (ROHs) in Bahrain_Tylos and worldwide populations (plotting median values for each population). (B) ROH length (>4 cM) distribution in sample MH2_LT and expected ROH distribution for the offspring of different consanguineous unions. See also Figure S6.

Figure 6. Malaria adaptation in Eastern Arabia (A) Phylogeny of present-day and ancient haplotypes from Middle Easterners surrounding the G6PD Mediterranean mutation (rs5030868). (B) Extended haplotype homozygosity (EHH) in present-day Arabians. (C) Allele frequency trajectory of the variant in present-day Emiratis (EmiratiC) highlighting changes in modes of subsistence in Eastern Arabia. See also Figure S7.

Table 1. Ancient DNA samples from Bahrain sequenced in this study

● KEY RESOURCES TABLE
● RESOURCE AVAILABILITY
  ○ Lead contact
  ○ Materials availability
  ○ Data and code availability
● EXPERIMENTAL MODEL AND SUBJECT DETAILS
● METHOD DETAILS
  ○ Archaeological sample processing, DNA extraction, library preparation and sequencing
  ○ Radiocarbon dating
  ○ Sequence read processing and alignment
  ○ Sex determination and relatedness analysis
  ○ Authenticity of aDNA sequences and contamination estimates
  ○ Population genetics analyses

Cell Genomics 4, 100507, March 13, 2024