Provenance studies using lead isotopy: contribution of the consideration of geological contexts in archaeological databases

Abstract – The identification of mineral supply sources and trade routes is at the heart of archaeological issues. The tracing of sources of metal production via lead isotopy has been used since the 1980s to identify the deposits from which the metal constituting archaeological objects came. Such studies are based on mineral signature repositories, and archaeologists have thus built up databases containing thousands of ore deposit analyses. These databases, however, only very rarely include geological information and are limited to geographical information. Considering only geographical data imposes many limitations on such studies, including the overlapping of signatures between remote regions. This problem could nevertheless be circumvented by taking into account precise ore deposit data, which make it possible to think in terms of restricted mineralized subsets. We illustrate this through the example of data collected in the Alps by Marcoux (1986) and Nimis et al. (2012). Taking into account geological data (and more specifically, gitological data) could thus considerably improve the accuracy of provenance interpretations and enable multivariate statistical processing to be carried out (these statistical treatments are inconclusive if they are carried out on purely geographical data). However, there remains the problem of the thousands of ore analyses carried out without their mineralization context having been defined. Although still imperfect (some contexts are not individualized), a statistical treatment could nevertheless be used to identify the gitological contexts.


Introduction
Lead has four stable isotopes: ²⁰⁴Pb, which is primordial, and ²⁰⁶Pb, ²⁰⁷Pb and ²⁰⁸Pb, which are radiogenic¹. These radiogenic isotopes are the end products of the radioactive decay of uranium and thorium: ²³⁵U ⇒ ²⁰⁷Pb; ²³⁸U ⇒ ²⁰⁶Pb; ²³²Th ⇒ ²⁰⁸Pb. Copper and silver artifacts, as well as pigments, contain lead in very small quantities. This trace lead comes from the ore that was used to make the object, and its isotopic signature can be measured. As there is no lead fractionation during the metallurgical process (Cui and Wu, 2011; Pernicka, 2014), the lead isotopic signature of an artifact is comparable to that of the ore from which it was made, and can make it possible to identify the ore deposit used to craft the archaeological object regardless of the period of manufacture². Lead isotopic studies have thus been used since the late 1980s to identify the source of the metal used to produce an artifact found in an archaeological context. Such work is commonly referred to as provenance studies, and lead isotopic analyses are an important part of the analyses carried out in archaeological studies.
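As a reminder, the three decay chains listed above correspond to the standard closed-system growth equations of isotope geochemistry. The notation is introduced here for illustration and is not taken from the original text: t is the time during which lead remained in contact with its parent elements, the subscript 0 denotes the initial ratio, the parent/²⁰⁴Pb ratios are those measured at the end of that interval, and the λ values are the decay constants.

```latex
\begin{align*}
\left(\frac{{}^{206}\mathrm{Pb}}{{}^{204}\mathrm{Pb}}\right)_{t}
  &= \left(\frac{{}^{206}\mathrm{Pb}}{{}^{204}\mathrm{Pb}}\right)_{0}
   + \frac{{}^{238}\mathrm{U}}{{}^{204}\mathrm{Pb}}\left(e^{\lambda_{238} t} - 1\right) \\
\left(\frac{{}^{207}\mathrm{Pb}}{{}^{204}\mathrm{Pb}}\right)_{t}
  &= \left(\frac{{}^{207}\mathrm{Pb}}{{}^{204}\mathrm{Pb}}\right)_{0}
   + \frac{{}^{235}\mathrm{U}}{{}^{204}\mathrm{Pb}}\left(e^{\lambda_{235} t} - 1\right) \\
\left(\frac{{}^{208}\mathrm{Pb}}{{}^{204}\mathrm{Pb}}\right)_{t}
  &= \left(\frac{{}^{208}\mathrm{Pb}}{{}^{204}\mathrm{Pb}}\right)_{0}
   + \frac{{}^{232}\mathrm{Th}}{{}^{204}\mathrm{Pb}}\left(e^{\lambda_{232} t} - 1\right)
\end{align*}
% Commonly used decay constants (per year):
% \lambda_{238} \approx 1.55125 \times 10^{-10}
% \lambda_{235} \approx 9.8485  \times 10^{-10}
% \lambda_{232} \approx 4.9475  \times 10^{-11}
```

Because ²⁰⁴Pb is not produced by any of these chains, it acts as the stable reference in all three ratios.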
The importance of tracing provenance is such that thousands of mineral analyses have been carried out as part of archaeological studies and gathered in databases such as Oxalid³ (which includes analyses performed on protohistoric artefacts and minerals from the Mediterranean, the British Isles and the Balkans) or the database compiled by B. Scaife⁴ (which includes analyses of ores from Egypt, the Levant and the Maghreb). Nevertheless, although archaeologists recognise that the isotopic signature of lead is specific to each mineralization (Stos-Gale, 1992), they do not take geological contexts into account in their provenance studies. Indeed, as already pointed out by Ixer (1999), Guénette-Beck and Serneels (2010) and Baron et al. (2014), archaeological databases contain only geographical information and lack gitological⁵ information. The nature of the minerals analyzed is not always indicated, and the complexity of the deposits from which they are derived is generally poorly known. In addition, ores are classified by large geographical areas in which mineralizations of very different ages and types can be found. Finally, some samples do not even correspond to ore deposits: they are host rocks entirely unrelated to artifacts (the chromites in the Oxalid database, for example).
In addition, provenance studies generally do not take into account data collected by geologists. Used as metallogenic tracers⁶, the "geological" data are often considered to be too sparsely sampled to answer the question of the origin of archaeological artefacts. Indeed, geologists carry out few measurements per mine but perform measurements on several deposits affected by the same mineralizing episode. Archaeologists, by contrast, favour numerous measurements per mine in order to obtain the most complete possible spectrum of the signature of the mine being investigated.
However, could lead isotopic measurements from geological research be integrated into archaeological provenance studies? What could be the contribution of taking into account geological contexts (and not just a geographical origin) for provenance studies? Could it improve the tracing of production sources?
2 Archaeological provenance studies
The study of provenances via lead isotopy is a complex operation, requiring many factors to be taken into account.
First of all, defining the geographical origin of the metals making up an artifact requires knowing the chemical composition of the artifacts. The origin of an object made of a copper alloy can only be traced through the use of lead isotopes if the quantity of lead in the artefact is known. While the thresholds affecting signatures are as yet poorly known, it is nevertheless clear that if lead is added as an alloying element, the signature of the lead deposit blurs that of the copper deposit.
It is also important to consider the period in which the artifact was made, the sources known to have been used at the time, and the extent of trade networks, as these are valuable clues for deciding between several possible sources of production (the artifact typology can also be used to obtain valuable information about the area of origin of the objects).
However, even with these precautions, the use of lead isotopes as the only source tracer is regularly criticized (Baron and Coustures, 2018) because of two main factors: source mixing and statistical overlaps.
(i) Possible mixtures of sources and recycling may occur during the manufacture of the objects. The mixtures can be of different types:
- Mixtures of metals from different deposits can be made, for example, by mixing ores from different sites to obtain an alloy with specific properties;
- Mixture signatures can also be the result of recycling: a metallic object (a broken one, for example) can be remelted. Metallurgists can melt copper from a broken object and mix it with copper of a different origin to create a new object with a signature corresponding to the average of the two deposits⁷ (Longman et al., 2018).
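The mixing effect described in (i) can be sketched numerically. The following is a minimal illustration, not a method taken from the cited works: it assumes ratios normalized to ²⁰⁴Pb, for which the mixture is a weighted mean with weights equal to the amount of ²⁰⁴Pb contributed by each source. The function name `mix_ratios` and all numerical values are hypothetical.

```python
def mix_ratios(sources):
    """Mix lead isotope ratios that are normalized to 204Pb.

    `sources` is a list of (pb204_amount, ratios) pairs, where `ratios`
    maps a ratio name to its value and `pb204_amount` is the quantity of
    204Pb contributed by that source (any consistent unit).
    """
    total = sum(amount for amount, _ in sources)
    if total <= 0:
        raise ValueError("total 204Pb amount must be positive")
    names = sources[0][1].keys()
    # For x/204Pb ratios the mixture is the 204Pb-weighted mean.
    return {
        name: sum(amount * ratios[name] for amount, ratios in sources) / total
        for name in names
    }


# Hypothetical end members (illustrative values, not measured data).
deposit_a = {"206/204": 18.20, "207/204": 15.60, "208/204": 38.20}
deposit_b = {"206/204": 18.80, "207/204": 15.70, "208/204": 38.90}

# Equal lead contributions give the plain average of the two signatures.
equal_mix = mix_ratios([(1.0, deposit_a), (1.0, deposit_b)])

# If source B brings nine times more lead, the mixture sits close to B.
lead_rich_b = mix_ratios([(1.0, deposit_a), (9.0, deposit_b)])
```

Under these assumptions the signature is the simple average of the two deposits only when both contribute the same amount of lead; a lead-rich component dominates the result.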
(ii) Problems of statistical discrimination of signatures from certain metalliferous regions: the tracing of production sources is often hampered by the fact that statistical discrimination of geochemical signatures is sometimes impossible, because some minerals from distant regions have very similar lead isotopic signatures. These coinciding signatures of remote mining districts are generally referred to as signature overlaps. For example, the lead isotopic signatures of the Taurus Mountains in Turkey are very similar to those of Cyprus (Yener et al., 1991) but also to those of the Aegean region (Eshel et al., 2019). Consequently, in some studies, several possible metal supply sources are proposed, and it is impossible to decide between them without additional distinguishing evidence. Typological and/or geochemical data (trace elements) from the artifacts are then sometimes used in addition to lead isotopes.
While solutions to identify source mixtures are still lacking⁸, the solution found to reduce overlaps is to characterize the full extent of the variability of the isotopic signature of the sampled mines. Archaeological databases therefore generally include several measurements per mine: the median number of analyses per mine is 17 in the Oxalid database, and Stos-Gale and Gale (2009) recommend conducting 30 to 50 analyses per mine. Thus, some historical mines such as Laurion (Oxalid database) or the mining district of Mitterberg in Austria (Pernicka et al., 2016) have been characterized by hundreds of mineral analyses.
Once the mines and large production areas have been delineated, provenance studies can be carried out. These methods have evolved considerably: during the 1980s, archaeologists compared the signatures of ores and artifacts by simple reading on two separate graphs, ²⁰⁸Pb/²⁰⁶Pb vs ²⁰⁷Pb/²⁰⁶Pb and ²⁰⁶Pb/²⁰⁴Pb vs ²⁰⁷Pb/²⁰⁶Pb (see Stos-Gale and Gale (2009) for a precise description of the evolution of the tracing process). In the early 1990s, this graphical reading was facilitated by the tracing of 95% confidence ellipsoids highlighting the signatures of the various mining districts. During the 2000s, these ellipsoids began to be traced by the Euclidean distance method and kernel density, making their contours sharper (and statistically more accurate). However, these discriminations between groups of mine populations are problematic: on mineral databases disconnected from any geological context, it seems almost impossible to differentiate some regions (Pampaloni, 2017).
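The 95% confidence ellipse test mentioned above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the procedure of the cited authors: it treats the ores of one district as a bivariate normal sample, uses the sample mean and covariance, and compares the squared Mahalanobis distance of the artifact with the chi-square quantile for two degrees of freedom. The function names and the data are hypothetical.

```python
import math


def ellipse_stats(points):
    """Return the mean and the sample covariance terms of 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def inside_ellipse(point, points, level=0.95):
    """True if `point` lies within the `level` confidence ellipse."""
    (mx, my), (sxx, sxy, syy) = ellipse_stats(points)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("degenerate covariance: points are collinear")
    dx, dy = point[0] - mx, point[1] - my
    # Squared Mahalanobis distance, using the inverse of the 2x2 covariance.
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    # Chi-square quantile for 2 degrees of freedom has a closed form.
    threshold = -2.0 * math.log(1.0 - level)
    return d2 <= threshold


# Hypothetical district: (207Pb/206Pb, 208Pb/206Pb) pairs, not real data.
district = [
    (0.8350, 2.0700), (0.8360, 2.0730), (0.8372, 2.0745),
    (0.8381, 2.0790), (0.8390, 2.0800), (0.8365, 2.0720),
]
artifact = (0.8600, 2.0700)
compatible = inside_ellipse(artifact, district)
```

With only a handful of analyses per district, the chi-square threshold understates the uncertainty on the mean and covariance, so a small-sample correction would enlarge the ellipse; the sketch ignores this.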
Another limitation can be pointed out: not all producing areas are present in the archaeological databases. A striking example is the southern part of the Massif Central at the end of Prehistory: although many mining sites and metallurgical workshops have been discovered, no isotope analysis has been carried out there to date outside the Cabrières site. The same is true for French Brittany, where there are many indications of copper exploitation in the early Bronze Age (Pailler et al., 2016) but for which archaeologists have not conducted any sampling. These areas, which still lack data in archaeological databases, have nevertheless been analyzed by the BRGM (Bureau de Recherches Géologiques et Minières). Besides, some areas which do not show any evidence of ancient extraction are not necessarily devoid of exploitation sites: archaeological surveys are difficult to carry out in these areas, which are generally mountainous, wooded and little explored by archaeologists. Indeed, the identification of paleo-pollution from local protohistoric (Bronze Age and late La Tène period) mining operations in the Morvan Mountains (Monna et al., 2004) has led to the identification of La Tène mines at Bibracte; Bronze Age mines remain to be identified.
The use of analyses carried out by geologists can therefore be useful in filling gaps in archaeological databases. Moreover, the way geologists approach sampling could provide a new way of processing data and avoid the effect of signature overlaps.
3 Contribution of geological/gitological data
One of the reasons given by archaeologists for not taking into account lead isotopic data from geological work is that they consider that these studies include too few analyses per deposit. Indeed, in geology, only one or two measurements are performed for simple and monogenic (and therefore very homogeneous) deposits, and 7 to 10 for complex polygenic deposits (Marcoux, 1986, p. 288). However, this low number of analyses per mine is compensated for by a wider coverage of the mineralized zone, which is then defined in its entirety and therefore in all its variability of signatures.
In other words, geological studies draw valuable information from the analyses performed because they characterize the fine variations of signatures within the same metallogenic province. The lack of analysis of small deposits in archaeological studies may in fact prove to be a limitation for the tracing of sources.
Used very early as geochronometers (Stacey and Kramers, 1975; Cumming and Richards, 1975)⁹, lead isotopes were quickly applied to the study of mineralization phenomena. Zartman and Doe (1981) demonstrated that powerful orogenic and/or hydrothermal phenomena erase the lead isotopic signatures of the rocks involved to create new closed systems¹⁰. Thus, the isotopic signature of a deposit depends on three factors (Marcoux, 1986): the signature of the remobilized source(s); the age of the mineralization (radioactive decay occurs at the end of the mineralization process); and the mineralizing fluid phase (two mineralizations of the same age but involving different fluids and/or rocks from different geological contexts will therefore have different isotopic signatures).
In more detail, signatures from deposits of different types but from the same region are closer together than those of synchronous deposits of the same type scattered over a vast geological area. Indeed, the isotopic composition of lead seems to be more closely related to the geological environment than to the typology or chronology of mineral deposits: this phenomenon is called "regionalism" (Marcoux, 1986, p. 99).
This regionalism, as Marcoux (1986) pointed out, implies that an orogenic belt does not have a homogeneous isotopic signature¹¹. Thus, at the scale of a large orogenic belt, the variation in lead isotope ratios is between 1 and 2%. Nevertheless, working at the scale of small geological sub-belts yields a significant gain in signature homogeneity: the variation in isotope ratios is then around 0.4%.
⁸ Among recent work, Bray and Pollard (2012) and Longman et al. (2018) propose statistical treatments to try to identify possible mixtures, but these are still in their early stages.
⁹ These works provide isochronous evolution curves still used today to date geological contexts via radioactive decay laws.
¹⁰ The isotopic signatures of these systems correspond to a mixture of the several deposition environments involved. An isotopic signature of lead is therefore a combination of multi-stage lead (from geological contexts of different ages). Marcoux (1986) characterizes this phenomenon under the term "lead mixture".
¹¹ Only very large deposits and undisturbed hydrothermal-sedimentary deposits have homogeneous signatures over long distances.
Geologists thus use the variability of lead isotopic signatures to study the filiation of deposits in terms of age or metallogenic sources¹² within the same mining district. The purpose here is therefore not to accurately characterize a mine but to compare the signatures of several deposits in the same mineralized belt. This fine description of the contexts makes it possible to separate the mineral populations precisely and to avoid the overlaps that occur when information is considered without gitological data. This phenomenon can be illustrated by case studies; here we have chosen the Western Alps. The gitological contexts do not show a precise spatial distribution (Fig. 1).
Using these lead isotope data, we can try to assign a geographical origin to a theoretical object with the following signature: 206 […]. A bivariate diagram drawn without any gitological information does not make it possible to discriminate between the French Alps (circles) and the Italian Tyrol (triangles). Moreover, tracing 95% confidence ellipsoids does not allow a distinction to be made between the western Alps and the Italian Tyrol (Fig. 2). A presentation of the results by Euclidean distances would not give a convincing result either: the signature of the object is located at an equal distance from the French and Italian ores.
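The Euclidean distance comparison, and the kind of tie it can produce, can be illustrated with a short sketch. The function name `distance_by_region`, the two-ratio signature and every value below are hypothetical; they are chosen only to reproduce the situation described above, where the object is equidistant from the ores of two regions.

```python
import math


def distance_by_region(obj, ores):
    """Smallest Euclidean distance from `obj` to the ores of each region.

    `ores` is a list of (region, signature) pairs; signatures are tuples
    of isotope ratios given in the same order as `obj`.
    """
    best = {}
    for region, signature in ores:
        d = math.dist(obj, signature)
        if region not in best or d < best[region]:
            best[region] = d
    return best


# Hypothetical (206Pb/204Pb, 207Pb/204Pb) signatures, not measured data.
theoretical_object = (18.40, 15.65)
ores = [
    ("French Alps", (18.37, 15.61)),
    ("French Alps", (18.20, 15.60)),
    ("Italian Tyrol", (18.43, 15.69)),
    ("Italian Tyrol", (18.60, 15.70)),
]
distances = distance_by_region(theoretical_object, ores)
```

Here both regions end up at a distance of about 0.05, so the distance alone cannot decide between them. Distances computed on raw ratios also depend on the scale of each ratio, which is one reason standardization is often applied first.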
Moreover, just as no conclusive provenance result can be obtained here by graphical reading, none is obtained with multivariate statistical treatments. As already mentioned by Sayre et al. (2001) (for Anatolia), the use of Gaussian mixtures does not lead to the creation of mineral population groups representative of reality. This can be explained by the fact that the distribution of the lead isotopic signatures of deposits is not normal to begin with. A Shapiro-Wilk normality test can be used to highlight this fact. Carried out on the Variscan basement stratiform massive sulphides (which have the clearest distribution), the test indicates non-normality with an error threshold of less than 2%.
A solution allowing a clear distinction between these contexts can nevertheless be provided through the use of gitological data. Indeed, taking these data into account allows a more accurate answer to our provenance question. This can be seen in a bivariate diagram: if we draw the diagram ²⁰⁸Pb/²⁰⁴Pb vs ²⁰⁶Pb/²⁰⁴Pb (in order to visualize the gitological contexts¹⁴), we can note that the geological contexts are well individualized (Fig. 3): the different types of ore deposits are highlighted by their alignment along regression curves.
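The reading of Fig. 3 by alignment along regression lines can be sketched in code. This is only an illustration under assumptions: each context is summarized by an ordinary least-squares line in the ²⁰⁸Pb/²⁰⁴Pb vs ²⁰⁶Pb/²⁰⁴Pb plane, and the object is attributed to the line with the smallest vertical residual. The context names and values are hypothetical and do not come from Marcoux (1986) or Nimis et al. (2012).

```python
def fit_line(points):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("x values are all identical")
    slope = sum((x - mx) * (y - my) for x, y in points) / sxx
    return slope, my - slope * mx


def closest_context(obj, contexts):
    """Return the context whose line passes closest to `obj`, plus residuals."""
    x, y = obj
    residuals = {}
    for name, points in contexts.items():
        slope, intercept = fit_line(points)
        residuals[name] = abs(y - (slope * x + intercept))
    return min(residuals, key=residuals.get), residuals


# Hypothetical (206Pb/204Pb, 208Pb/204Pb) pairs for two deposit types.
contexts = {
    "context A (hypothetical)": [
        (18.20, 38.20), (18.30, 38.32), (18.40, 38.44), (18.50, 38.56),
    ],
    "context B (hypothetical)": [
        (18.20, 38.50), (18.30, 38.58), (18.40, 38.66), (18.50, 38.74),
    ],
}
theoretical_object = (18.35, 38.39)
best, residuals = closest_context(theoretical_object, contexts)
```

Lying close to a line is a necessary rather than a sufficient condition: the object should also fall within the range of ratios actually observed for that context.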
Three other bivariate diagrams can be drawn. The geographical identification of the source of the theoretical object then appears obvious: the object comes from the French Alps and, more precisely, from a Pb-Zn-Cu type 1 vein¹⁵ (Fig. 4).
This example illustrates the fact that taking into account the gitological contexts makes it possible to differentiate deposits, and therefore metal sources, through the observation of linear regressions in diagrams including ²⁰⁸Pb. In the absence of gitological data, and with only geographical data (as is the case in the Oxalid database, for example), it is impossible to distinguish regions that are geologically different. It should also be noted that a finer geographical division would not have enabled the different source areas to be distinguished either, whereas gitological clustering over an area produces sub-populations whose variances are lower than the variance of the area as a whole.
¹² Volcanogenic massive sulphide deposits from the same metallogenic region but with different formation ages thus have different signatures.
¹³ Deposits created in contact with magmatic (hot) and carbonate (cold) rocks.
¹⁴ As previously mentioned, this is because ²⁰⁸Pb results from the decay of thorium.
¹⁵ According to our division.
Moreover, thinking in terms of types of mineralization rather than considering each mine individually seems more coherent, because some mines are linked to the exploitation of complex mineralizations (linked to remobilization phenomena or successions of hydrothermal phases). For example, a wide dispersion of signatures is observed for the Saint Véran copper deposit in the French Alps (data from Giunti (2011) and Cattin (2008)) (Fig. 5). The mineralization shows major remobilization phases linked to the succession of 4 intense tectonic phases (Ancel et al., 2006).
It would thus appear more useful to assign an artefact to a precise gitological context than to a purely geographical origin. Perhaps we should rethink our reading of lead isotopes and return to what they characterize: a geological history. Admittedly, the geographical precision of the origins might be lower, but we would have a more accurate view of the sources, with fewer overlaps (although the limitation of source mixtures remains).
5 Perspectives: More precise interpretations possible by taking into account gitological data
As previously mentioned, in bivariate reading, groupings made in the absence of gitological data create heterogeneous population groups with wide ranges of isotopic signatures. The integration of new data can only create additional overlaps in these heterogeneous groups. Taking gitological contexts into account therefore makes it possible both to reduce the effect of overlaps and to obtain a very fine signature of mining districts (signature ranges are restricted and metallogenic contexts are highlighted by ²⁰⁸Pb).
Thinking in terms of geology makes it possible to rethink the use of multivariate statistics which, as mentioned above, have proved inconclusive with geographical data alone. Several types of statistical treatments can be applied.

Statistical treatments applicable to lead isotopic data supported by gitological information
Since the mid-2010s, the team at the Padua laboratory has been providing archaeological tracing work accompanied by gitological data (Artioli et al., 2016). This same team presents provenance studies in which Euclidean distances are traced in 3D (Artioli et al., 2014), which makes it possible to observe the variations of 3 ratios on the same graph¹⁶.
However, in order to assess the 3D data and propose relevant groupings of lead isotopic signatures, precise gitological data are required (the finer the geological division, the narrower the range of signatures per deposit type). This type of treatment is very effective in known gitological contexts and defines production zones with precision at the scale of a mining district or even of a mine.
Figure caption (beginning not preserved): […] (Marcoux, 1986) are represented by circles and the Italian Tyrol (Nimis et al., 2012) by triangles. Each colour represents a type of ore deposit. These are not the only isotope analyses carried out on alpine ores; the other analyses are indicated by black crosses (data from Höppner et al., 2005; Cattin, 2008; Giunti, 2011; Artioli et al., 2016; Pernicka et al., 2016).
¹⁶ In addition, this team standardizes isotopic ratios on ²⁰⁴Pb and not on ²⁰⁶Pb to better distinguish groups of mineral populations.
But this is not the only type of statistical treatment possible. If a MANOVA is not possible because the populations are not perfectly Gaussian, a treatment combining PCA then HCA (presented below), or a Discriminant Factorial Analysis, could be envisaged to determine the sources (Tomczyk et al., 2019). A Discriminant Factorial Analysis has the advantage of blocking known contexts. However, it requires that the population groups have a high inter-class variability and a low intra-class variability, and that these groupings be represented by sufficient samples (each gitological context must be represented by at least 3 to 7 data points, depending on the type of deposit).
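The two requirements stated above (enough samples per context, and high inter-class relative to intra-class variability) can be screened before attempting a discriminant analysis. The sketch below is a simplification under assumptions: it works on a single isotope ratio and uses the one-way ANOVA F ratio as a univariate stand-in for the multivariate criterion. The function name `screen_groups`, the default threshold and the values are hypothetical.

```python
def screen_groups(groups, min_samples=3):
    """Drop under-sampled contexts and compute a between/within F ratio.

    `groups` maps a gitological context to its list of values for one
    isotope ratio. Returns the retained context names and the ratio of
    between-class to within-class mean squares.
    """
    kept = {k: v for k, v in groups.items() if len(v) >= min_samples}
    if len(kept) < 2:
        raise ValueError("need at least two contexts with enough samples")
    n = sum(len(v) for v in kept.values())
    k = len(kept)
    means = {name: sum(v) / len(v) for name, v in kept.items()}
    grand = sum(sum(v) for v in kept.values()) / n
    between = sum(len(v) * (means[name] - grand) ** 2 for name, v in kept.items())
    within = sum((x - means[name]) ** 2 for name, v in kept.items() for x in v)
    if within == 0:
        raise ValueError("no intra-class variability; F ratio undefined")
    f_ratio = (between / (k - 1)) / (within / (n - k))
    return sorted(kept), f_ratio


# Hypothetical 208Pb/204Pb values per context (not measured data).
groups = {
    "veins": [38.20, 38.22, 38.25, 38.23],
    "stratiform": [38.80, 38.83, 38.78, 38.81],
    "remobilization": [38.50, 38.52],  # too few samples, will be dropped
}
retained, f_ratio = screen_groups(groups, min_samples=3)
```

A large ratio suggests well-separated contexts on that variable; it says nothing about normality, so it complements rather than replaces the checks discussed in the text.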
There is still a limitation concerning the processing of the thousands of isotopic data already present in archaeological databases and lacking any gitological context. Here, beyond tracking production sources, HCA treatments give good results and can propose classes that are fairly consistent with the gitological contexts.

Multivariate statistics taking into account the 5 isotopic ratios
We thus apply a multivariate statistical treatment taking into account all 5 isotopic ratios of lead commonly used by archaeologists and geologists (²⁰⁷Pb/²⁰⁶Pb, ²⁰⁸Pb/²⁰⁶Pb, ²⁰⁶Pb/²⁰⁴Pb, ²⁰⁷Pb/²⁰⁴Pb and ²⁰⁸Pb/²⁰⁴Pb)¹⁷. Our goal is to test whether a multivariate treatment makes it possible to recover the gitological contexts, in addition to testing the origin of the theoretical object (and to verify that the result is the same as the one obtained by graphical reading). Indeed, taking into account the 5 ratios stretches the statistical model (Fig. 6).
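For reference, the five ratios can be assembled from the three ²⁰⁴Pb-normalized ones, since all five are built from the same four isotopes. The helper below is a hypothetical convenience function with illustrative values, not part of the original workflow.

```python
def five_ratios(r206_204, r207_204, r208_204):
    """Return the five commonly used lead isotope ratios.

    The two 206Pb-normalized ratios are obtained by dividing the
    corresponding 204Pb-normalized ratios by 206Pb/204Pb.
    """
    if r206_204 <= 0:
        raise ValueError("206Pb/204Pb must be positive")
    return {
        "207Pb/206Pb": r207_204 / r206_204,
        "208Pb/206Pb": r208_204 / r206_204,
        "206Pb/204Pb": r206_204,
        "207Pb/204Pb": r207_204,
        "208Pb/204Pb": r208_204,
    }


# Hypothetical sample (illustrative values only).
sample = five_ratios(18.40, 15.64, 38.44)
```

The ²⁰⁶Pb-normalized ratios are nonlinear functions of the other three, so they add no independent measurement, but they do change the geometry seen by linear methods such as PCA.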
A Hierarchical Cluster Analysis (HCA) performed on the coordinates of a PCA involving the 5 (standardized) ratios makes it possible to observe the similarity of the isotopic signatures of the different gitological contexts¹⁸.
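A minimal sketch of this kind of clustering is given below, with two stated simplifications. First, the PCA step is omitted: when all components are retained, PCA is an orthogonal rotation of the standardized data and leaves Euclidean distances unchanged, so clustering the standardized ratios directly yields the same distances. Second, average linkage is assumed because the linkage criterion is not specified in the text. Function names and data are hypothetical.

```python
import math
from statistics import mean, stdev


def standardize(rows):
    """Scale each column to zero mean and unit sample standard deviation."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [stdev(c) for c in cols]
    return [tuple((v - m) / s for v, m, s in zip(row, mus, sds)) for row in rows]


def average_linkage(points, n_clusters):
    """Naive agglomerative clustering; returns lists of sample indices."""
    clusters = [[i] for i in range(len(points))]

    def gap(a, b):
        # Mean pairwise Euclidean distance between two clusters.
        return mean(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        pairs = [
            (gap(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        _, i, j = min(pairs)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return [sorted(c) for c in clusters]


# Hypothetical (206/204, 207/204, 208/204) triplets for six ore samples.
samples = [
    (18.20, 15.60, 38.20), (18.22, 15.61, 38.24), (18.25, 15.62, 38.27),
    (18.80, 15.70, 38.90), (18.83, 15.71, 38.95), (18.78, 15.69, 38.88),
]
# Five ratios per sample: 207/206, 208/206, 206/204, 207/204, 208/204.
rows = [(b / a, c / a, a, b, c) for a, b, c in samples]
clusters = average_linkage(standardize(rows), n_clusters=2)
```

On these well-separated toy data the two groups of three samples are recovered; real ore data with overlapping contexts would not separate this cleanly.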
Considering the 5 isotopic ratios brings out a fairly clear distinction between the different geological contexts (Fig. 7). The Italian Alps ore deposits are mainly distributed in one large statistical class. The gitological contexts that appear clearly in graphical reading (stratiform or Variscan sulphide deposits, for example) are also clearly distinguished in the sub-classes of the dendrogram. The proposed statistical treatment thus makes it possible to recover the contexts when they do not show significant overlaps. Contexts with very similar signatures are distributed in the same subclass (Ladinian, remobilization and Mesozoic stratiform deposits are grouped in the same subclass; this is also the case for part of the Mesozoic cover and Permian sulphide ore deposits). The HCA thus seems to give a reliable reflection of the gitological contexts. It could be used when the latter are missing, but it does little to overcome the signature overlaps.
Moreover, if one seeks to affiliate the theoretical object, it is possible to assume the gitological context from which the metal used to produce the theoretical object originates: the object is located in a fairly well-defined class corresponding to the Pb-Zn-Cu type 1 veins (as previously identified).
Thus, far from being a perfect solution for retrieving metallogenic contexts, the use of multivariate statistics could provide an approximation of the geological reality when the latter is missing. However, it is no substitute for extensive field […].
Fig. 6. Correlation circle of the Principal Component Analysis illustrating the contribution of taking into account 5 isotopic ratios (based on data from Marcoux (1986) and Nimis et al. (2012)): ²⁰⁶Pb/²⁰⁴Pb is strongly anti-correlated with the ratios normalized on ²⁰⁶Pb, which allows the angle to be opened. The ratios normalized on ²⁰⁶Pb are then the extremes of the model.
Fig. 7. Hierarchical Cluster Analysis (HCA) using the 5 lead isotope ratios, performed on the Marcoux (1986) and Nimis et al. (2012) data. Certain types of deposits are subdivided into different subclasses. Some are widely dispersed and lose cohesion (e.g. remobilization). The theoretical object is associated with one of the two subclasses of the Pb-Zn-Cu type 1 veins.
Finally, it is important to point out that where gitological information is available rather than having to be sought, it could be integrated into a statistical treatment allowing simultaneous processing of quantitative (isotope ratios) and qualitative (gitological) data. One possibility would be a multiple correspondence analysis (MCA) performed on a table of variables with qualitative modalities, gathering the (natively qualitative) gitological variable as well as the isotopic ratio variables (grouped into classes so that they can be treated as qualitative modalities). This type of approach is quite classical for data from the social sciences and humanities, where the problem of the simultaneous treatment of quantitative and qualitative data frequently arises. A recent example where this type of treatment has been used is Vanlandeghem et al. (2020).
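The data preparation implied by the MCA proposal (putting isotope ratios into classes and gathering them with the gitological variable in one table of qualitative modalities) can be sketched as follows. Only the preparation is shown, not the MCA itself. The equal-count binning, the number of classes, the function names and the values are all assumptions made for illustration.

```python
def quantile_bins(values, n_bins=3):
    """Assign each value a class index 0..n_bins-1 using equal-count bins."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * n_bins // len(values)
    return labels


def indicator_table(columns):
    """Build the complete disjunctive (0/1) table from qualitative columns.

    `columns` maps a variable name to its list of modalities, one per
    sample. Returns the (variable, modality) header and the 0/1 rows.
    """
    header = [(name, m) for name, col in columns.items() for m in sorted(set(col))]
    n = len(next(iter(columns.values())))
    rows = [[int(columns[name][i] == m) for name, m in header] for i in range(n)]
    return header, rows


# Hypothetical samples: gitological context and 208Pb/204Pb value.
gitology = ["veins", "veins", "stratiform", "stratiform", "veins", "stratiform"]
r208_204 = [38.20, 38.24, 38.90, 38.95, 38.27, 38.88]

classes = quantile_bins(r208_204, n_bins=3)
header, rows = indicator_table({"gitology": gitology, "208Pb/204Pb class": classes})
```

Each row of the resulting table contains exactly one 1 per variable, which is the input format a correspondence analysis of this kind expects. The choice of class boundaries affects the result and would need to be justified on real data.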

Conclusion
A detailed characterization of gitological contexts makes it possible to refine statistical groupings of ore data and avoid the effects of signature overlaps. Far from blurring the picture because of the small number of samples per mine, taking the gitological contexts into account improves lead isotopic source tracing. It allows a better discrimination of mineral populations and reduces signature overlaps. It would thus be appropriate to rethink the interpretation of provenance by first defining potential (and incompatible) types of deposits and then, in a second stage, proposing hypotheses concerning geographical origins.
There is still a limitation concerning the processing of the thousands of isotopic data already present in archaeological databases and lacking any gitological context. Although statistics can be used to generate groupings close to the real contexts, as shown in this paper, it remains important to update these databases when possible and, moreover, to consider geological contexts when planning future mineral sampling.