Devising quality assurance procedures for assessment of legacy geochronological data relating to deglaciation of the last British-Irish Ice Sheet

This contribution documents the process of assessing the quality of data within a compilation of legacy geochronological data relating to the last British-Irish Ice Sheet, a task undertaken as part of a larger community-based project (BRITICE-CHRONO) that aims to improve understanding of the ice sheet's deglacial evolution. As accurate reconstructions depend on the quality of the available data, some form of assessment is needed of the reliability and suitability of each given age(s) in our dataset. We outline the background considerations that informed the quality assurance procedures devised given our specific research question. We describe criteria that have been used to make an objective assessment of the likelihood that an age is influenced by the technique-specific sources of geological uncertainty. When these criteria were applied to an existing database of all geochronological data relating to the last British-Irish Ice Sheet they resulted in a significant reduction in data considered suitable for synthesis. The assessed data set was used to test a Bayesian approach to age modelling ice stream retreat and we outline our procedure that allows us to minimise the influence of potentially erroneous data and maximise the accuracy of the resultant age models.


Introduction
Numerical ice sheet models provide insights into the response of ice sheets around the globe to various global warming scenarios, but these models have to be validated through comparison with field evidence relating to the evolution of former ice sheets (Stokes et al., 2015). The accurate reconstruction of rates and patterns of deglaciation is, in turn, fundamentally dependent on the quality of the geochronological data that provide a temporal framework. As early as the 1950s, advances in radiocarbon (14C) dating permitted glacial events to be constrained in absolute time (e.g. Flint, 1955; Godwin and Willis, 1959). In the subsequent decades, palaeo ice sheets around the world became better constrained by steadily rising numbers of absolute geochronometric ages, firstly by 14C and then by luminescence (e.g. Berger and Eyles, 1994; Duller et al., 1995) and cosmogenic dating (e.g. Phillips et al., 1990, 1994). When age measurements were scarce, glaciological reconstructions of entire sectors often hinged on a small number of ages, or even on individual ages. A classic example for the British-Irish Ice Sheet (BIIS) is the set of Dimlington ages for maximum ice advance in East England (Penny et al., 1969). As more ages became available it became apparent that ice sheets did not reach their maximum extents or retreat synchronously. Subsequently, ice sheets became a focus for investigation to improve understanding of global climatic teleconnections (e.g. Denton and Hendy, 1994; Gosse et al., 1995; Osborn et al., 1995; Ivy-Ochs et al., 1999; Barrows et al., 2007; Moreno et al., 2009). The ever-increasing accumulation of legacy geochronological data is spread across hundreds of different publications, making it difficult to address regional or ice sheet scale reconstruction; this can be termed the Compilation Problem.
It has recently been addressed for many ice sheets including the Laurentide Ice Sheet (Dyke et al., 2002), the British-Irish Ice Sheet (Hughes et al., 2011), the Antarctic Ice Sheets (Bentley et al., 2014) and the Eurasian Ice Sheets (Stroeven et al., 2015;Hughes et al., 2016). These geochronological compilations reveal how numerous age constraints have become. However, they can also reveal incompatibility or direct conflicts between ages. Such conflicts are not surprising for two reasons. Firstly, dating techniques and their robustness have vastly improved over time (Lowe and Walker, 2015). Secondly, the geological context of the material sampled for dating might have more than one interpretation, or the strength of the association between an age and the event that is of interest may vary. Both factors yield conflicts in specific regions that have often forced authors of a reconstruction to rely on some ages but argue against others. It is apparent that not all legacy ages are equally-robust with respect to addressing a specific research question; this can be termed the Quality Problem.
The issue of quality assurance of geochronological data has received considerable attention in various areas of science, including the archaeological (e.g. Pettitt et al., 2003; Blockley et al., 2008; Graf, 2009) and paleoclimatological (e.g. Lowe and Walker, 2000; Brauer et al., 2014) communities, with the radiocarbon technique specifically receiving much attention. These studies have highlighted a range of issues that can influence the quality of geochronological data and many have gone on to define set criteria for assessing the reliability of data (e.g. Pettitt et al., 2003; Graf, 2009; Blockley and Pinhasi, 2011). Within archaeology and paleoclimatology much of the focus has been on assessing data that exist in very close association with other data (e.g. 14C dates from a sequence) and more rarely on sparsely distributed data. Additionally, the resolution that is sought is often on the order of 10^1 to 10^2 years. For the reconstruction of past ice sheets this concentration of data from a single location, and this achievable resolution, are desirable but generally rare. While some glaciological compilations have been provided with internal quality assurance, the DATED project (Hughes et al., 2016) being a recent and commendable example, little has yet been published in the ice-sheet literature about the underpinning decision-making criteria and pragmatic approaches to the task.
A large consortium of researchers (N 45) is currently working on the BRITICE-CHRONO project to better constrain the retreat history of the BIIS (Clark, 2014), acquiring new ages and appraising the existing legacy data (Hughes et al., 2011). In order to inform ice sheet reconstructions, and to feed into future numerical modelling, a systematic approach to how all ages are to be used has been devised to address the 'Quality Problem'. It is the purpose of this paper to outline the guidelines used to assess a legacy data set and the criteria devised for doing so. A review is provided of the issues that can introduce geological uncertainty into dating deglaciation by the most commonly applied techniques. We outline how consideration of these was used to create technique-specific guidelines and criteria for assessing geochronological data for constraining rates and patterns of deglaciation. We integrated the assessed data with Bayesian age modelling and outline a procedure for maximising the confidence that can be achieved in the results.

Dating deglaciation
Observations of current ice margins (e.g. in Antarctica) can robustly and directly constrain the timing of ice advance and retreat on annual timescales (Rignot et al., 2014), but such observations are limited to the last few decades over which we have aerial photographs and satellite images. The need to understand the longer-term significance of observed changes in modern ice sheets demands a means to reconstruct changes in ice sheets over timescales relevant to deglaciation, i.e. 10^2 to 10^5 years (Stokes et al., 2015). However, beyond the limits of direct observations there is no geochronological technique that can directly constrain the timing of glacial advance or retreat. Rather, we date features within the geomorphological and sedimentary record (Fig. 1) that are formed before, during or after deglaciation, and can thus be directly (e.g. an exposed glacial surface) or indirectly (e.g. fluvioglacial outwash) linked to past ice extents. Geochronological control on such features can represent minimum or maximum ages for deglaciation depending on the geomorphic and/or stratigraphic context of the sample collected, and the quality of the ages fundamentally influences subsequent interpretation (Fig. 2).
Within any compilation of geochronological data an unknown proportion of measured samples will be affected by factors that can make the resulting ages inaccurate. Ages obtained from chronological methods are derived from the measurement of specific physical properties (e.g. the ratio of 10 Be/ 9 Be in cosmogenic nuclide dating). The actual measurement of a physical property has a set of defined systematic and random uncertainties associated with processing and measurement which are reflected in the quoted error term that accompanies the reported result. The measured physical property(s) are used to calculate an equivalent calendar age that is then assumed to be contemporaneous with, or constrain, the age of the event of interest. Wrapped up within these assumptions of equivalence are other sources of uncertainty which can be broadly separated into two categories; factors that could have affected the measured property before sampling and over which workers have limited control, and the strength of the geological association between the material that is being dated and the event of interest. While recognizing that there is a wide range of potential sources of uncertainty that fall under these two categories we use the broad term 'geological uncertainty' to describe both. We consider this appropriate as in both categories it is the geological history or context of the sampled material that is the source of the uncertainty. As it is not possible to quantitatively constrain all sources of geological uncertainty there is no guarantee that an age derived from a measurement will be an accurate and/or precise constraint on a geological event. Every geochronological technique measures different material in different settings and as such they implicitly suffer from different sources of geological uncertainty.
Numerous geochronological techniques have been utilised to investigate past ice extent and other related questions. However, the vast majority of available ages are contributed by three techniques, namely cosmogenic exposure dating, luminescence dating and 14C dating. As such we focus further discussion on these methods. Other techniques, such as tephrochronology, have high potential for providing age constraints on sediments associated with glacial retreat; however, tephrochronology has as yet seen relatively little systematic application in constraining ice sheet deglaciation (cf. Dugmore, 2006, 2008) and there are no tephrochronology data within the BRITICE v1 database. Lowe (2011) provides a detailed review of tephrochronology and the interested reader is directed there.

Cosmogenic nuclide exposure ages
(Terrestrial) cosmogenic nuclides (CN) are produced by interactions between minerals exposed at the Earth's surface and secondary cosmic radiation. A variety of isotopes are produced by these interactions, including radioactive 10Be, 26Al, 36Cl and 14C and the stable noble gas isotopes 21Ne and 3He. The differing properties of the various CN (i.e. differing half-lives and production rates) result in them being used to address a range of geochronological and geomorphological questions, while their various production mechanisms allow them to be applied to a wide variety of lithologies. Of the CN available to researchers, 10Be, 26Al and 36Cl are, by some distance, the most widely applied to constraining the past extent and deglaciation of ice sheets (e.g. Stone et al., 2003; Bentley et al., 2006; Briner et al., 2003; Ballantyne et al., 2009a; Svendsen et al., 2015). 10Be and 26Al are produced within quartz through spallation (of O and Si respectively) and muon capture (Gosse and Phillips, 2001). The primary production pathways of 36Cl are spallation of K and Ca, muon capture by K and Ca, and thermal neutron capture by 35Cl (Zreda et al., 1991; Schimmelpfennig et al., 2009). The differing production pathways of 36Cl mean it can be applied to rocks that are quartz poor, including carbonates and mafic igneous rocks (e.g. basalts). Due to the differing properties of the particles involved, the relative importance of the production pathways changes with depth beneath the Earth's surface. Most CN studies use 10Be to determine exposure ages as 26Al measurements have larger measurement uncertainties (Gosse and Phillips, 2001) (n.b. 26Al is most commonly applied alongside 10Be as a test for complex exposure histories). The relatively ubiquitous occurrence of quartz within crustal rocks results in 10Be being the CN most widely applied to constraining ice sheet extent.
CN exposure dating has been widely applied to the majority of extant and former ice sheets (including the BIIS) and has made a significant contribution to our understanding of their past extents and evolution through time (cf. Balco, 2011). The focused production of CN within the top few metres of the Earth's surface makes them particularly useful for exposure dating features related to past ice sheets such as moraine boulders (e.g. Small et al., 2012), glacially transported boulders (e.g. Fabel et al., 2012) and, glacially modified bedrock (e.g. Stone and Ballantyne, 2006). When an ice margin retreats and first exposes the material CN production begins. The CN concentration within samples taken from these surfaces can then be used to calculate an exposure age which, if all assumptions hold, will closely equate to the time of deglaciation.

Obtaining a CN exposure age
Analysis for CN exposure dating is undertaken by accelerator mass spectrometry (AMS), which measures a ratio (e.g. 10Be/9Be) for the isotope of interest; this ratio is then used to calculate the concentration of the CN, reported in atoms per gram. Knowledge of the CN concentration allows calculation of an exposure age when combined with knowledge of the production rate. Numerous studies have attempted to establish production rates for the various nuclides (Nishiizumi et al., 1989, 1996; Masarik and Reedy, 1995; Swanson and Caffee, 2001; Schimmelpfennig et al., 2009) and improving constraints on production rates is an ongoing field of research (e.g. Small and Fabel, 2015; Marrero et al., 2016a). Currently there are two online calculators that can calculate exposure ages from the calculated CN concentration and relevant sample data (e.g. latitude, longitude, elevation, sample thickness, sample density, topographic shielding correction factor, AMS standard). The most widely used of these is the CRONUS-Earth online calculator (Balco et al., 2008; http://hess.ess.washington.edu). Users have the option of calculating 10Be exposure ages using a globally calibrated production rate or a local production rate (e.g. Balco et al., 2009; Putnam et al., 2010; Young et al., 2013; Small and Fabel, 2015). The other online calculator (CRONUScalc; Marrero et al., 2016b; http://web1.ittc.ku.edu:8888/2.0/html) includes functionality for calculating exposure ages with nuclides other than 10Be and 26Al. It also includes the option to scale production rates using the newly available Lifton-Sato-Dunai scaling scheme (Lifton et al., 2014). It currently does not include online functionality to calculate exposure ages using a user defined production rate.
Fig. 1. Simple schematic of a deglaciated landscape depicting some of the key geomorphic and stratigraphic scenarios where ages approximating deglaciation can be obtained using one of the three main geochronological techniques (TCN, 14C, and OSL). Additional constraints can be obtained from particular marine sediments such as IRD layers and glacigenic sediments. Normal text depicts minimum deglaciation ages, italics depict maximum deglaciation ages.
The various options available for calculating exposure ages in terms of choice of production rate, scaling factors and now, calculation method are likely to result in a variety of approaches being taken in the literature. For data compilations for use in ice sheet reconstructions it is vital that there are sufficient supporting data describing the analytical procedures, primary data collected and, calculation methods. This allows published ages to be recalculated so that all CN ages and their uncertainties are comparable within a given geochronological compilation. For our purposes we recalculated 10 Be exposure ages using the CRONUS-Earth calculator (Balco et al., 2008) using the Lm scaling scheme and erosion rates as specified by the original authors for all samples. We use a local production rate derived from Fabel et al. (2012).
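As a rough illustration of how a measured concentration and a production rate combine into an exposure age, the following sketch uses the simplest possible model: a surface with no inheritance, no erosion and no shielding, for which N = (P/λ)(1 − e^(−λt)). It is not the calculation performed by the online calculators, which also apply scaling schemes and sample-specific corrections. The 10Be half-life, the production rate and the sample concentration are assumed values chosen for illustration only.

```python
import math

# Assumed constants (not taken from the text above).
BE10_HALF_LIFE_YR = 1.387e6
BE10_DECAY_CONST = math.log(2) / BE10_HALF_LIFE_YR  # per year


def simple_exposure_age(concentration, production_rate,
                        decay_const=BE10_DECAY_CONST):
    """Exposure age in years for continuous exposure with no erosion.

    concentration   -- CN concentration in atoms per gram
    production_rate -- site production rate in atoms per gram per year
    Inverts N = (P / lambda) * (1 - exp(-lambda * t)) for t.
    """
    saturation = production_rate / decay_const
    if not 0 <= concentration < saturation:
        raise ValueError("concentration must lie below the saturation value")
    ratio = concentration * decay_const / production_rate
    return -math.log(1.0 - ratio) / decay_const


# Hypothetical sample: 6.4e4 atoms/g at a site producing 4.0 atoms/g/yr.
age = simple_exposure_age(6.4e4, 4.0)
print(f"{age / 1000:.1f} ka")
```

Because the age scales almost inversely with the production rate for young surfaces, recalculating all published ages with one production rate and one scaling scheme is what makes them comparable within a compilation.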
CN exposure ages are reported in years before present (a or ka). Ages are generally reported with two uncertainty values (both at ±1σ); the internal uncertainty reflecting the uncertainty on the measured AMS ratio and the external uncertainty which includes systematic uncertainties including production rate uncertainties and uncertainties introduced by sample processing (Dunai, 2010). The internal uncertainty is commonly used to assess consistency between ages obtained from a single location and processed together through the same lab, where systematic uncertainties can be taken as being equal for all ages. External uncertainties are used for comparison to exposure ages from other locations and for comparison to other dating techniques.

Sources of geological uncertainty
Assumptions made regarding the exposure and shielding history of any given sample have the potential to introduce significant geological uncertainty to any given exposure age. For the purposes of constraining ice sheet retreat these assumptions are that there has been no prior exposure of a surface to cosmic radiation (inheritance; Fig. 3) and that since deglaciation exposure has been continuous and constant (partial exposure) (Balco, 2011). Sampling for cosmogenic exposure dating requires careful selection of samples to minimise such complicating issues. Studies that address first order questions, such as whether an area was ice free during the Last Glacial Maximum (LGM), are relatively insensitive to geological uncertainty (e.g. Ballantyne, 2010). In contrast, high-resolution reconstructions of rates and patterns of deglaciation require geochronological data that reflect the true age of deglaciation as accurately as possible, thus all sources of geological uncertainty potentially become significant. Inheritance occurs where there is insufficient removal of material to completely remove any existing accumulation of CN. In glacial landscapes it is most commonly encountered when sampling landforms formed predominantly by glacial abrasion (e.g. Briner and Swanson, 1998), where sub-glacial erosion rates can be relatively low (Hallet, 1979). It is less common when sampling landforms formed by glacial plucking (Colgan et al., 2002) and rare when sampling glacially transported boulders of sub-glacial origin ( Fig. 4) (Putkonen and Swanson, 2003;Heyman et al., 2011), given the higher rates of erosion associated with these processes (Hallet et al., 1996). While there has been a general focus on boulders (Balco, 2011), many studies have sampled other glacial landforms, and any compilation is likely to feature a range of sample settings where inheritance is possible and thus its likelihood must be assessed.
Fig. 2. A hypothetical deglaciation sequence initially populated with ages that are taken at face value to reconstruct changes in ice extent in a palaeoclimatic context using the NGRIP δ18O record (Rasmussen et al., 2006). Addition of new data fundamentally changes the interpretation of the relationship between changes in ice extent and climate. In this hypothetical example the large aliquot OSL age is several ka older than the age obtained using the single grain approach (e.g. Duller, 2006, 2008). The bulk radiocarbon age is older than the radiocarbon age obtained from a macrofossil in the same horizon (e.g. Grimm et al., 2009). The TCN ages in the original scenario reflect nuclide inheritance which is only identifiable if sufficient samples exist (e.g. Everest et al., 2013). While this example is deliberately extreme it is intended to highlight that the availability of data exerts a fundamental control on the subsequent reconstruction.
'Partial exposure' describes a wide range of processes that can act to reduce the rate of CN accumulation following landform exposure or, in the case of erosion, remove material containing a proportion of the accumulated CN inventory. Reducing the rate of CN accumulation requires the sampled material to have been shielded by another material that attenuates the incoming cosmic radiation, with a concomitant reduction in the production rate. Water, snow, soil, and vegetation can all attenuate cosmic radiation and, if cover is sufficiently thick, attenuation can be total (Gosse and Phillips, 2001). Erosion of an exposed surface removes material containing a proportion of the accumulated CN and reveals material that was previously shielded. This newly exposed material was accumulating CN at a lower rate than the removed material because of the attenuation of cosmic rays with depth, and thus has a CN inventory lower than expected if there had been no erosion. Consequently, if no account is taken of erosion, sampling this material will result in an underestimation of the true exposure age. In general rates of erosion on crystalline rocks in glacial environments are quite low at c. 2 mm ka^-1 (André, 2002), but other more friable lithologies, sporadic episodes of exfoliation (Zimmerman et al., 1994) and granular disintegration (Kirkbride and Bell, 2010) can lead to higher erosion rates.
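The size of the underestimation caused by ignoring erosion can be gauged with a minimal sketch. It assumes steady surface erosion and production by spallation only, so that N = P/(λ + ρε/Λ) × (1 − e^(−(λ + ρε/Λ)t)). The attenuation length, rock density, 10Be half-life, production rate and the 16 ka "true" age are all assumed, illustrative values; only the 2 mm ka^-1 erosion rate comes from the text.

```python
import math

BE10_DECAY_CONST = math.log(2) / 1.387e6  # per year; assumed 10Be half-life
ATTENUATION_LENGTH = 160.0                # g/cm^2; assumed, spallation only
ROCK_DENSITY = 2.65                       # g/cm^3; assumed


def concentration(true_age_yr, production_rate, erosion_cm_per_yr):
    """Surface concentration (atoms/g) after steady erosion."""
    k = BE10_DECAY_CONST + ROCK_DENSITY * erosion_cm_per_yr / ATTENUATION_LENGTH
    return production_rate / k * (1.0 - math.exp(-k * true_age_yr))


def apparent_age(n_atoms, production_rate):
    """Exposure age (years) obtained when erosion is ignored."""
    lam = BE10_DECAY_CONST
    return -math.log(1.0 - n_atoms * lam / production_rate) / lam


true_age = 16_000.0                 # hypothetical deglaciation age, years
erosion = 2.0 * 0.1 / 1000.0        # 2 mm/ka expressed in cm/yr
n = concentration(true_age, 4.0, erosion)
print(f"apparent age {apparent_age(n, 4.0):.0f} a, true age {true_age:.0f} a")
```

Under these assumptions the apparent age is a few percent younger than the true age, which matters little for first order questions but becomes relevant for high-resolution reconstructions.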
Fig. 3. a) The sample has been completely shielded from cosmic rays prior to glaciation and continuously exposed since deglaciation. b) The sample is exposed to cosmic rays prior to glaciation and experiences no post-glacial shielding (prior exposure); the apparent exposure age will exceed the deglaciation age. c) The sample is completely shielded from cosmic rays prior to glaciation and partially shielded from cosmic rays following deglaciation (incomplete exposure); the apparent exposure age will be younger than the deglaciation age.
As there is no way of quantitatively assessing the potential for inheritance and/or partial exposure in a single exposure age, workers generally focus on obtaining multiple ages from the same feature (Putkonen and Swanson, 2003; Balco, 2011). This allows the use of various approaches to identify and exclude outliers that can reasonably be argued on statistical (e.g. Chauvenet's criterion) and/or geomorphic grounds to have been affected by geological uncertainty. Ages that cannot be excluded, particularly if they cluster tightly, can then be judged as representative of the true exposure age of the sampled landform. Balco (2011) provides a useful review of the differing approaches used to assess datasets for the effects of geological uncertainty, the most commonly applied being the reduced chi-squared statistic $\chi^2_R$, which is given by:
$$\chi^2_R = \frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{t_i - t_{avg}}{\sigma_i}\right)^2$$
where $t_1 \ldots t_n$ are a set of apparent exposure ages, $\sigma_1 \ldots \sigma_n$ are the corresponding measurement (internal) uncertainties and $t_{avg}$ is the arithmetic average of the apparent exposure ages. This statistic compares the observed scatter within a dataset to the scatter expected from measurement uncertainty alone. For a dataset with infinite degrees of freedom, $\chi^2_R \approx 1$ if measurement uncertainty is the sole cause of the observed scatter (Bevington and Robinson, 2003); but for more restricted datasets with fewer degrees of freedom (such as geochronological data) higher $\chi^2_R$ values are associated with acceptable p-values (Bevington and Robinson, 2003; their Table C.4). If the $\chi^2_R$ is higher than the appropriate threshold then it is inferred that geological uncertainty is contributing to the observed scatter. The age of a given landform is usually defined by the ages that cluster together and generally taken as the arithmetic mean. Error-weighted means are dominated by ages with lower AMS uncertainties. As there is no reason to assume such ages are more accurate with respect to dating an event of interest, use of error-weighted means is not favoured (Brauer et al., 2014).
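The reduced chi-squared test and the two kinds of mean discussed above can be sketched in a few lines. The statistic is computed about the arithmetic mean with n − 1 degrees of freedom, following the definition given in the text; the ages and uncertainties are hypothetical values invented for illustration, and the function names are not from any published tool.

```python
def reduced_chi_squared(ages, sigmas):
    """Scatter of the ages relative to their internal uncertainties."""
    n = len(ages)
    if n < 2 or n != len(sigmas):
        raise ValueError("need at least two ages with matching uncertainties")
    t_avg = sum(ages) / n
    return sum(((t - t_avg) / s) ** 2 for t, s in zip(ages, sigmas)) / (n - 1)


def arithmetic_mean(ages):
    """Unweighted mean, the landform age favoured in the text."""
    return sum(ages) / len(ages)


def error_weighted_mean(ages, sigmas):
    """Inverse-variance weighted mean, dominated by the most precise ages."""
    weights = [1.0 / s ** 2 for s in sigmas]
    return sum(w * t for w, t in zip(weights, ages)) / sum(weights)


# Hypothetical boulder ages (ka) from one moraine; the last is a suspected
# outlier carrying inherited nuclides.
ages = [15.8, 16.1, 16.4, 19.9]
sigmas = [0.4, 0.5, 0.4, 0.5]

print("all samples:     ", round(reduced_chi_squared(ages, sigmas), 2))
print("outlier excluded:", round(reduced_chi_squared(ages[:3], sigmas[:3]), 2))
print("landform age (ka):", round(arithmetic_mean(ages[:3]), 2))
```

With the suspected outlier included the statistic is far above 1, whereas the remaining three ages scatter no more than their measurement uncertainties predict. The threshold against which the statistic is judged depends on the degrees of freedom and is not encoded here.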

14C is produced in the upper atmosphere through the interaction of 14N and cosmic radiation. 14C readily binds with O to produce CO2, after which it mixes through the atmosphere, is absorbed into the ocean and, through photosynthesis and the food chain, becomes fixed in living organisms. Unlike the other carbon isotopes (12C and 13C), 14C is radioactive; thus following death the concentration of 14C (and the ratio of radioactive/stable isotopes) within an organism decreases at a known rate. Using these characteristics a measurement of 14C from a deceased organism can be used to calculate the time since death (Libby et al., 1949; Arnold and Libby, 1951). This ability to date organic material has been extensively applied in a wide variety of fields where geochronological data that constrain the age of an object (e.g. archaeology) or a deposit (e.g. archaeology, paleoenvironmental [including glacial] studies) are vital.
Within paleoenvironmental studies radiocarbon has been widely applied to date sedimentary archives containing proxy records that can be related to past climate change (e.g. Lowe et al., 2004; Wohlfarth et al., 2006; van Asch et al., 2012). In this context samples are taken from various depths within a sequence and the ages from these are used to constrain the timing of observed changes in paleoenvironmental proxies (e.g. air temperature reconstructions, δ18O records). 14C ages may also be used to assist with 'wiggle-matching' to a regional stratotype such as the Greenland ice core records (e.g. Small et al., 2013). Radiocarbon has also provided valuable constraints on the past evolution of ice sheets (e.g. Dyke et al., 2002; Ó'Cofaigh and Evans, 2007; Lowell et al., 2009). This is despite the obvious limitation that glaciers and ice sheets do not directly deposit organic material; thus any organics found in association with glacial deposits must have lived some time before or after the glacial event.
Radiocarbon can be used to constrain past ice sheet evolution in a variety of ways. Ice advance can be constrained by dating organic material reworked into glacial deposits (Ó'Cofaigh and Evans, 2007) with such ages providing a maximum age for ice advance. Constraining deglaciation using radiocarbon is most commonly achieved by dating basal organics with close association to glacial deposits (e.g. Dyke, 2004;Lowell et al., 2009). These ages provide a minimum age of deglaciation. Similarly almost any 14 C age can be interpreted as a minimum deglaciation age if it is not stratigraphically overlain by glacial deposits. In this way basal ages from sedimentary sequences sampled for other paleoenvironmental studies can be included in geochronological compilations for use in constraining glaciation even if there is not a close association with glacial deposits.

Obtaining a 14 C age
To obtain a 14C age a sample of organic material is taken from the horizon of interest. Radiocarbon analyses are undertaken on a variety of material including bulk organic sediments, wood, charcoal, bone, seeds, leaves, and marine macro- and micro-fossils. The sample may be collected in the field but is more commonly extracted from processed samples (e.g. monoliths or cores) in a laboratory. The wide variety of material suitable for radiocarbon dating requires careful sample selection, where such a choice exists, as the differing life environments of organisms can result in them producing variable 14C ages (cf. Lowe et al., 2001; Walker et al., 2001). Following sampling, material can be subject to various pre-treatments, which can influence the accuracy of the 14C age (e.g. Bird et al., 1999; Jacobi and Higham, 2008; Blockley et al., 2008). It is then prepared for analysis either through traditional beta-counting techniques (based on decay of 14C) or through AMS measurement of carbon isotope ratios. Radiocarbon dating carried out through beta-counting can produce ages of comparable accuracy to AMS (Walker et al., 2001; Boaretto et al., 2002); thus there is no intrinsic reason to favour ages produced by one technique over the other. AMS does, however, allow for measurement of 14C in considerably smaller samples. Thus, while traditional beta-counting techniques require several grams of carbon, AMS analyses can be undertaken on milligrams of material. Results are reported as conventional radiocarbon ages (14C yrs BP [before 1950]) which include a δ13C correction for isotopic fractionation (Stuiver and Polach, 1977). Ages are reported at ±1σ with uncertainties reflecting counting statistics and corrections.
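For readers less familiar with the reporting convention, a conventional radiocarbon age follows from the fractionation-normalised 14C content of the sample relative to the modern standard (often written F14C) through a simple logarithm. The sketch below assumes the Libby mean life of 8033 years used by convention (Stuiver and Polach, 1977); the function name and the example value are hypothetical.

```python
import math

LIBBY_MEAN_LIFE_YR = 8033.0  # Libby half-life (5568 a) divided by ln 2


def conventional_age(f14c):
    """Conventional radiocarbon age (14C yrs BP) from fraction modern.

    f14c is the fractionation-normalised 14C content relative to the
    modern standard; 1.0 corresponds to 0 14C yrs BP.
    """
    if f14c <= 0.0:
        raise ValueError("fraction modern must be positive")
    return -LIBBY_MEAN_LIFE_YR * math.log(f14c)


# Hypothetical measurement: one fifth of the modern 14C content remains.
print(round(conventional_age(0.2)), "14C yrs BP")
```

The result is a conventional age, not a calendar age; it still requires calibration before it can be compared with ages from other techniques.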
Due to changes in the atmospheric 14C/12C ratio over time (Stuiver and Suess, 1966) and, by convention, the use of the original 'Libby' half-life value (Stuiver and Polach, 1977), 14C ages (14C yrs BP) do not equate to calendar years and require calibration. Radiocarbon calibration curves are constructed by comparing raw 14C data to independently acquired calendar ages, ideally an absolute record that has directly incorporated carbon from the atmosphere at time of formation (Reimer et al., 2013). Tree rings are optimal records for radiocarbon calibrations as they can be independently dated through dendrochronology and the tree ring based calibration curve now extends to 13.9 ka (Reimer et al., 2013). The incorporation of other records (e.g. speleothems, varved records) has resulted in the most recent calibration curve (INTCAL13) extending over the last 50 ka (Reimer et al., 2013). Calibration curves have been extended and refined over time, resulting in a variety of different calibration curves being applied to 14C ages in the literature (e.g. Reimer, 1986, 1993; Reimer et al., 2004, 2009, 2013; Fairbanks et al., 2005). Data calibrated using one curve are not, sensu stricto, directly comparable to data calibrated using another, and differences in the resulting calibrated ages (cal a BP) have the potential to hinder comparison of published geochronological information.
A variety of software exists for the calibration of radiocarbon data including OxCal (Bronk Ramsey, 2013; http://c14.arch.ox.ac.uk/embed.php?File=oxcal.html) and CALIB (Stuiver and Reimer, 1993; http://www.calib.org). Calibration of conventional 14C ages results in an output of possible corresponding calendar ages. The variations in the 14C/12C ratio over time manifest as 'wiggles' in the calibration curves that can cause some 14C measurements to have multiple possible calibrated ages (Fig. 5). With a carefully constructed sampling strategy comprising several closely spaced samples those wiggles can be used to improve the certainty of the calibration; however, most legacy data have insufficient sampling density to utilise the wiggles constructively.
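Why a single 14C measurement can map onto several calendar ages can be shown with a toy example. The curve below is entirely synthetic (a linear trend with a sinusoidal wiggle) and is emphatically not INTCAL13 or any real calibration dataset; the function names, the curve parameters and the search interval are all assumptions made for illustration, and measurement uncertainty is ignored.

```python
import math


def toy_curve(cal_bp):
    """Synthetic 14C age for a calendar age; NOT a real calibration curve."""
    return 0.9 * cal_bp + 120.0 * math.sin(cal_bp / 60.0)


def calendar_matches(c14_age, start=10_000, stop=12_000):
    """Calendar years (cal a BP) at which the toy curve crosses c14_age.

    Scans year by year and records each sign change of the difference
    between the curve and the measured age.
    """
    matches = []
    for cal in range(start, stop):
        lower = toy_curve(cal) - c14_age
        upper = toy_curve(cal + 1) - c14_age
        if lower == 0 or lower * upper < 0:
            matches.append(cal)
    return matches


# A measurement falling on a wiggle intersects the curve more than once,
# whereas one falling on a steep limb intersects it only once.
print(calendar_matches(10_000))
print(calendar_matches(9_900))
```

Real calibration software works with probability distributions rather than simple intersections, so a wiggle typically produces a broad or multi-modal calibrated range rather than a short list of discrete ages.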

Sources of geological uncertainty
Lowe and Walker (2000) identify three sources of error in 14 C ages: 1) calibration to calendar years, 2) laboratory contamination (and measurement precision) and 3) site specific geological problems. Calibration uncertainties are controlled by the accuracy of the applied calibration curve and are intrinsic to the radiocarbon method. By standardising the calibration applied within a geochronological compilation these uncertainties are consistent within a data-set and, by declaring raw measurements and uncertainties within the compilation, calibrations can be updated. Data compilers (and users) are clearly unable to influence uncertainties introduced by laboratory contamination and/or procedures. The radiocarbon community has undertaken an extensive program of quality assurance to give confidence in the comparability between results and amongst laboratories (e.g. Long and Kalin, 1990; Scott et al., 2010). While some older results may not have been subject to such rigorous procedures it is impractical to attempt an ad hoc assessment of comparability between results within a geochronological compilation. Given this, it is most practical to treat all 14 C ages within a geochronological compilation as being comparable.
'Site specific geological problems' are the most relevant for undertaking quality assurance on legacy data. This category comprises two main sources of uncertainty: i) processes, other than radioactive decay, that influence the 14 C/ 12 C ratio in the organism (before or after death), and ii) processes that result in the age of the sample not accurately reflecting the age of the adjacent sediments. Consideration (i) relates to chemical processes that include isotopic fractionation, recrystallization, contamination and reservoir effects. Consideration (ii) relates to the physical processes through which the sampled material was transported and deposited, which give rise to its geological context with respect to an event of interest.
Several chemical processes are important considerations for obtaining accurate 14 C ages. Isotopic fractionation occurs naturally due to the preferential uptake of light isotopes (i.e. 12 C, 13 C) over heavier isotopes (i.e. 14 C) in some biochemical processes (Craig, 1953). It can be corrected for by measuring 13 C/ 12 C in the sample and normalising to an agreed value (Stuiver and Polach, 1977). This produces a standardised δ 13 C (‰) value that reflects the immediate environment in which the sample originated. With knowledge of the expected δ 13 C value for a given environment this can be used to assess the impact of other chemical processes that have altered the 14 C levels of a sample since death such as recrystallization of shells, contamination, and mixing of material from differing environments (i.e. terrestrial and aquatic photosynthesisers). This provides a potential means of undertaking quality assurance of 14 C ages within a geochronological compilation by incorporating δ 13 C values in assessment criteria. However, not all radiocarbon data is reported with δ 13 C values and in some cases there may not be sufficient contextual data to make post hoc assessments of whether a δ 13 C value is anomalous.
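As a minimal sketch of the fractionation correction, the following assumes the commonly used convention of normalising to δ 13 C = -25‰ and the Libby mean life of 8033 years for conventional ages; the text above only states that an "agreed value" is used, so these constants, the function names and the example values are assumptions made for illustration.

```python
import math

LIBBY_MEAN_LIFE = 8033.0   # years; assumed constant for conventional 14C ages
REFERENCE_DELTA13C = -25.0  # per mil; assumed normalisation value

def normalise_fraction_modern(f_measured, delta13c_permil):
    """Normalise a measured 14C activity ratio to the reference delta13C.
    14C is assumed to fractionate twice as strongly as 13C."""
    offset = delta13c_permil - REFERENCE_DELTA13C
    return f_measured * (1.0 - 2.0 * offset / 1000.0)

def conventional_age(fraction_modern):
    """Conventional 14C age (yr BP) from a normalised activity ratio."""
    return -LIBBY_MEAN_LIFE * math.log(fraction_modern)

# Hypothetical marine carbonate with delta13C of 0 per mil.
f_raw = 0.2
f_norm = normalise_fraction_modern(f_raw, 0.0)
shift = conventional_age(f_norm) - conventional_age(f_raw)
print(round(shift))  # age shift in years introduced by the normalisation
```

For a sample already at the reference value the ratio is unchanged, whereas a carbonate near 0‰ shifts by roughly four centuries, which illustrates why an unreported δ 13 C value limits post hoc assessment.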
Another category of chemical processes particularly relevant for quality assurance of legacy data is 'reservoir effects' (Stuiver and Polach, 1977). These occur where organic material fixes carbon from a source other than the atmosphere. They can occur where carbon is sourced from rocks (hardwater effect), redeposited organic material, or from ocean water (Marine Reservoir Effect [MRE]) (Deevey et al., 1954;Mangerud and Gulliksen, 1975;Olsson, 1986).
Due to the extended residence time and large reservoir of 14 C within the ocean, marine samples are depleted in 14 C with respect to the atmosphere and, consequently, produce 'old' 14 C ages (the MRE). The offset between marine and terrestrial 14 C ages varies spatially and temporally (Fig. 6), thus there is no single, universal correction factor (Austin et al., 1995, 2011; Waelbroeck et al., 2001; Björck et al., 2003; Bondevik et al., 2006). Given the current state of knowledge some workers quote marine 14 C ages with bracketing maximum and minimum potential reservoir corrections (e.g. Small et al., 2013). While we acknowledge issues with establishing the precise timing of variations in the marine reservoir effect during the last deglaciation (e.g. Austin et al. (2011) tune their record to Greenland, assuming synchronicity), the magnitude of changes observed provides a reasonable guide for the choice of maximum/minimum corrections to be applied. The hardwater effect (Deevey et al., 1954) is caused by 14 C-depleted carbonate ions (i.e. from carbonate rocks) dissolving in freshwater and diluting the 14 C concentration of the dissolved inorganic carbon in the water. When this carbon is taken up by aquatic organisms their 14 C age is older than contemporaneous terrestrial organisms (e.g. Shotton, 1972; Child and Werner, 1999). The scale of the hardwater effect at a location can fluctuate over time due to changes in groundwater flow, the height of the water table and precipitation, thus a modern analogue provides only a rough guide to what the hardwater effect may have been in the past.
A final issue relevant to consideration (i) is that an improving appreciation of the complexities of 14 C dating reveals that legacy ages may be erroneous due to incomplete understanding of certain issues at the time. An example of this is the application of ultrafiltration pretreatments to Palaeolithic bones which improves the accuracy of 14 C ages obtained. A program of re-measurement of bones dated previously has revealed significantly different, and generally older, results when ultrafiltration was used (Jacobi and Higham, 2008). Similarly, Blockley et al. (2008) find that appropriate pre-treatments are crucial for obtaining accurate 14 C ages in certain settings.
Consideration (ii), above, stems from the fact that organic material does not always occur in life position in close association with sediments of interest. Consideration (i) notwithstanding, 14 C provides accurate ages on the time of death; however, the target material potentially lived an indeterminable amount of time before or after the geological event of interest (glaciation or deglaciation in this case) introducing an inherent level of geological uncertainty. Additionally, geological processes affect the position of organic remains within a sedimentary sequence thus there can be an unknown discrepancy between a 14 C age and the true age of the deposit. Processes such as reworking, redeposition and time transgressive deposition result in materials being ex situ. Their effect is demonstrated by significant age differences between different datable materials from the same horizon (e.g. Heier-Nielsen et al., 1995;Grimm et al., 2009;Hågvar and Ohlson, 2013).
Materials datable by 14 C can be broadly categorised as: a) bulk samples where organic material extracted from sediment is dated, b) microfossil samples requiring microscopy to extract, and c) macrofossil samples that can be visually identified and sampled. As each category has differing potential for being affected by the sources of geological uncertainty outlined above, one approach that can be taken to conducting quality assurance of radiocarbon data-sets is the construction of radiocarbon 'inventories' (cf. Lowe et al., 2001). Such inventories consist of 14 C ages obtained from a variety of materials within the same horizon(s) in a sedimentary sequence (e.g. Turney et al., 2000; Walker et al., 2001, 2003; Lowe et al., 2004), allowing identification of problematic materials which yield age reversals or consistent offsets with other dates from the same horizon (e.g. Turney et al., 2000; Walker et al., 2001). Such studies have shown that the category of sample (e.g. macrofossil), and even the choice of species to be dated, can have a considerable effect on the reliability of ages (e.g. Broecker and Clark, 2011; England et al., 2013); however, their application has been relatively sparse.
For the purposes of constraining ice sheet retreat the majority of 14 C ages within a compilation are unlikely to be part of an inventory. In this case it may be most practical to base quality assurance on the general category of material that has been dated, informed by the results of radiocarbon inventories. Bulk sediment samples have the highest potential for mixing carbon of different ages and from different sources as the sample is likely to have been deposited over a period of time and, potentially, mixed by processes such as bioturbation (e.g. Kershaw, 1986). In the context of constraining deglaciation bulk samples may be particularly prone to problems with re-mobilisation of potentially old carbon due to periglacial processes that were widespread around deglaciating margins. 14 C ages obtained from microfossil samples can suffer from the translocation of water soluble humic acids in situ, although chemical pretreatment of the sample material mitigates this. What is harder to quantify and correct are reservoir effects, including the hardwater effect. Additionally, microfossil samples consist of many individual microfossils, introducing potential for dating a sample of 'mixed' ages. The resulting 14 C age will be an average and potentially biased by incorporation of 'young' or 'old' fractions. The composition of the sample (e.g. monospecific versus polyspecific) can have a significant effect on how closely this age will reflect the true age of the sediment as determined by other means (Heier-Nielsen et al., 1995). Finally, macrofossils are often considered the optimal material for dating as the influence of reservoir effects can be effectively minimised by good sample selection (Törnqvist et al., 1992; Kitagawa and van der Plicht, 1998). They can, however, still be influenced by reworking and, if the sample contains more than one individual organism, by mixing.

Luminescence
Luminescence dating directly determines the time of sediment deposition (burial) by determining when a mineral grain (typically quartz or K-feldspar) was last exposed to sunlight (or bleached). Exposure to sunlight prior to deposition (e.g. during transport) releases accumulated charge within light-sensitive traps in the crystal lattice of mineral grains. After burial grains are exposed to ionizing radiation caused by the presence of radioactive elements (e.g. U, Th, K) in the natural environment. This radiation excites electrons that become trapped within crystal imperfections. The concentration of radionuclides and the magnitude of the radiation dose arising from cosmic rays are assessed for each sample and the data used to calculate the magnitude of the radiation dose per year, known as the environmental dose rate. The total dose to which the grains were exposed during burial (the equivalent dose; D e ) can then be determined in the laboratory and divided by the environmental dose rate to determine the time since deposition. The ubiquitous nature of the target minerals (quartz and feldspar) along with the ability to directly date sedimentary deposits has led to luminescence being widely applied to glacially derived sediments (e.g. Owen et al., 2002; Duller, 2006; Glasser et al., 2006; Pawley et al., 2008; Smedley et al., 2016, in press).
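The age calculation described above reduces to a single division, sketched below with a simple first-order uncertainty estimate. The function name, the units (Gy and Gy/ka) and the example values are hypothetical, and the quadrature sum assumes that the uncertainties on D e and on the dose rate are independent, which is a simplification of how laboratories combine random and systematic terms.

```python
import math

def luminescence_age(de_gy, de_err_gy, dose_rate_gy_ka, dose_rate_err_gy_ka):
    """Return (age, 1-sigma uncertainty) in ka from an equivalent dose (Gy)
    and an environmental dose rate (Gy/ka)."""
    age = de_gy / dose_rate_gy_ka
    # Relative uncertainties combined in quadrature (independence assumed).
    rel = math.sqrt((de_err_gy / de_gy) ** 2
                    + (dose_rate_err_gy_ka / dose_rate_gy_ka) ** 2)
    return age, age * rel

# Hypothetical sample: De = 40 +/- 2 Gy, dose rate = 2.0 +/- 0.1 Gy/ka.
age_ka, age_err_ka = luminescence_age(40.0, 2.0, 2.0, 0.1)
print(f"{age_ka:.1f} +/- {age_err_ka:.1f} ka")
```

The sketch also makes clear why legacy ages can only be recalculated when both D e and the dose rate components are reported in the original publication.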
In direct sunlight the optically stimulated luminescence signal from quartz grains can be bleached within a few seconds (Colarossi et al., 2015), however in nature the potential for bleaching is dependent on the transport and depositional pathways of the sampled sediment (Fuchs and Owen, 2008). Aeolian processes are generally considered optimal for luminescence dating as sub-aerial exposure of sediment, a pre-requisite for mobilization by wind, provides ample opportunity for exposure to light (Lancaster, 2008;Roberts, 2008). In such cases the measured D e will accurately reflect the time of deposition of the sediment. In comparison other transport mechanisms such as fluvial and glacio-fluvial processes have reduced potential for complete bleaching (Wallinga, 2002;King et al., 2014). This is due to the turbidity of the water or length of transport that acts to reduce the potential for exposure to light (Wallinga, 2002).
The use of optical stimulation to generate optically stimulated luminescence (OSL) and the development of the single-aliquot regenerative dose (SAR) protocol by Murray and Wintle (2000) have allowed the OSL signal of quartz to be used to provide accurate and precise ages in agreement with independent chronology from a variety of depositional environments (see Roberts, 2008; Murray and Olley, 2002), including glaciofluvial settings (e.g. Duller, 2006). In contrast to quartz, luminescence dating of K-feldspars using infra-red stimulated luminescence (IRSL) has been less widely applied, mostly due to the effects of anomalous fading (Wintle, 1973). However, recent improvements (Thomsen et al., 2008) have largely overcome this problem, and reliable ages have been shown to be obtainable from glaciofluvial sediments.
Direct dating of material deposited by ice sheets (i.e. till) can be problematic due to the limited potential for bleaching (e.g. Lukas et al., 2007;Fuchs and Owen, 2008). However, for the purposes of constraining glaciation, OSL can be applied to sediments from a variety of settings that can be linked to former ice margin positions such as ice marginal sediments (e.g. Thomas et al., 2006), glacial lake sediments (e.g. Lepper et al., 2007) and, glaciofluvial outwash (e.g. . Additionally, OSL ages from sediments that are above or below

Obtaining an OSL age
To obtain an OSL age a sample of sediment is taken from the unit of interest. OSL measurements can be undertaken on a variety of grain sizes (e.g. ~4-11 μm or ~63-300 μm) and minerals (quartz/feldspar). Thus, following sampling, the mineral and size fraction of interest are isolated using separation methods (e.g. Wintle, 1997). At all stages, from sampling to measurement, care is taken to shield the sample from exposure to anything other than red light (which does not bleach the OSL signal). There are numerous approaches to making OSL analyses including variations in the aliquot size measured (e.g. large aliquot, small aliquot, single-grain measurements). Grains are mounted on a disc and analysed with an OSL reader (e.g. Risø TL/OSL DA-15). The measured D e value is combined with the environmental dose rate, calculated from measured concentrations of radioactive elements and external gamma-dosimetry, to derive an OSL age. The measurement of multiple aliquots allows a distribution of D e values to be obtained, which in turn allows statistical models to be used to improve the accuracy of the D e value used in calculating an age. Several models exist including the minimum age model (MAM; Galbraith et al., 1999) and the internal external consistency criterion (IEU) model (Thomsen et al., 2007). The choice of model to be employed depends on the observed scatter in D e values. OSL ages are reported as years before present with ±1σ uncertainties that combine random and systematic uncertainties.
Given the rapid developments in the application of OSL it is likely that a geochronological compilation will contain OSL ages produced by a variety of methods and protocols. With sufficient supporting information users of compilations can, in consultation with OSL specialists, make judgements on which measurements are suitably robust in terms of the analytical procedures adopted.

Sources of geological uncertainty
One potential source of geological uncertainty in OSL dating is incomplete resetting of the OSL signal during transportation and deposition (Duller, 1994, 2008); this is commonly referred to as partial bleaching. Individual grains within partially-bleached sediment are likely to have experienced sunlight exposure of variable duration prior to burial. This will have reset the OSL signal of different grains to different levels, which causes scatter in D e distributions when replicate aliquots are measured, ranging from doses representative of the last deposition cycle up to larger (inherited) doses from grains that were never exposed to sunlight (i.e. OSL signals in saturation).
The extent of sunlight bleaching in nature is dependent on the transportation and depositional pathways of the sampled sediment (Fuchs and Owen, 2008; Livingstone et al., 2015). In fluvial and glacio-fluvial environments there is reduced potential for complete bleaching of sediment grains prior to burial (Duller, 1994). Factors such as the depth, turbidity and sediment content of the transport medium (water) can all enhance the attenuation of sunlight through the water column (Berger and Luternauer, 1987; Gemmell, 1988a, 1988b), which, in addition to the length and number of cycles of transport, affects the opportunity for bleaching of the OSL signal prior to burial. To overcome the uncertainty introduced into luminescence dating by incomplete bleaching, smaller aliquot sizes are typically used for OSL analysis (i.e. small multi-grain aliquots or single grains) because standard aliquots contain ~2500 grains and average out the effects of variable grain bleaching (Duller, 2008). Where partial bleaching may be an issue, large numbers of replicate measurements (at least 50 per sample; Rodnight, 2008) are used to characterise the distribution. Graphical representation of each distribution can be used to diagnose the presence of partial bleaching (Fig. 7). To determine an accurate age, statistical models (e.g. MAM, IEU) can then be used to determine the population of grains in the D e distribution that were completely bleached prior to burial.
Bioturbation can cause post-depositional grain mixing in sediments sampled for OSL dating and manifest in complex D e distributions containing grains that have been moved from underlying (older) and overlying (younger) sediments (e.g. Bateman et al., 2003). Identifying the effects of bioturbation on OSL dating can be challenging; however, samples taken from glaciofluvial settings typically do not experience such issues and existing sedimentary structures within the units sampled can often be used to rule out the influence of bioturbation.
Although the potential for geological uncertainty to be introduced into luminescence dating can be identified from information provided in the legacy data, it is normally impossible to correct for these effects at a later time, and this has significant implications for the inclusion of legacy luminescence data within geochronological compilations such as BRITICE-CHRONO.

Assessment of the BRITICE database
All known ages relating to the last BIIS were compiled into a database by Hughes et al. (2011; the BRITICE-database v1). The published BRITICE-database v1 was updated with the inclusion of newly published data (BRITICE-database v2) with a census date of 01/01/2013 (cf. Hughes et al., 2016). The v2 database contained a total of 1231 ages (686 14 C, 439 TCN, 106 OSL). The aim here was to build a new, assessed version (BRITICE-database v3) of the legacy database, in which our judgments improve on those in BRITICE-database v1/2 where all ages were taken at face value and used in a reconstruction as being of equal and high-enough reliability (Clark et al., 2012). Where the possibility of geological uncertainty is unacceptably high, the rating (confidence) that is assigned to any age is reduced.
An initial 'age filtering assessment' was carried out to focus assessment on those ages relevant to BRITICE-CHRONO, where our focus is on deglaciation of the last BIIS. Implicit in this is the exclusion of later glacial events, namely the Loch Lomond Readvance (LLR). This event is temporally constrained as the local equivalent to Greenland Stadial-1 (GS-1, 12.9-11.7 ka b2k; Rasmussen et al., 2006); thus ages younger than 13 ka were not considered for the purposes of BRITICE-CHRONO. Similarly, it is possible to place an upper boundary on ages for inclusion. As BRITICE-CHRONO is explicitly focused on retreat from the LGM it is reasonable to exclude ages that predate the maximum extent of the BIIS. Although the precise timing of this varies spatially, an absolute maximum age can be assigned based on two lines of evidence. Firstly, offshore IRD suggests expansion of the BIIS into the marine realm after 29 ka (Fig. 8) (Scourse et al., 2009). Secondly, ice-free conditions at c. 35 ka, in areas proximal to ice nucleation centres, are evidenced by the occurrence of Pleistocene fauna whose remains have been reliably dated using 14 C (Jacobi et al., 2009). Taking these two lines of evidence it is considered reasonable to assume that no sector of the BIIS reached its maximum extent prior to 30 ka. As a result, ages older than 30 ka are also not considered for further quality assurance. Applying these two age filters significantly reduces the number of ages within the database (Table 1).
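The two age filters can be sketched as a simple window test. The record structure and field names below are hypothetical, and the sketch compares only the central age against the 13 ka and 30 ka limits; how age uncertainties were treated at the boundaries is not specified here and is left out as a stated simplification.

```python
def within_age_window(age_ka, min_ka=13.0, max_ka=30.0):
    """True if an age falls inside the window retained for assessment."""
    return min_ka <= age_ka <= max_ka

def age_filter(records):
    """Split records into those retained and those excluded by age."""
    kept = [r for r in records if within_age_window(r["age_ka"])]
    excluded = [r for r in records if not within_age_window(r["age_ka"])]
    return kept, excluded

# Hypothetical records; identifiers and ages are illustrative only.
records = [
    {"id": "A", "age_ka": 11.9},  # within the Loch Lomond Readvance interval
    {"id": "B", "age_ka": 16.4},
    {"id": "C", "age_ka": 34.8},  # predates the assumed maximum extent
]
kept, excluded = age_filter(records)
print([r["id"] for r in kept], [r["id"] for r in excluded])
```

Keeping the excluded records in a separate list, rather than deleting them, mirrors the "Excluded" category used later so that filtered ages remain traceable.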
To make the best use of the BRITICE-database v2 we have developed an explicit and transparent protocol for the quality assurance carried out. An important principle was that the first assessment of an age or group of ages from any given site should be on the basis of the stratigraphic and geomorphic context and the details of the particular dating method involved. Specifically, after initial age filtering no regard should be given to whether "ages fit hypotheses" about wider regional patterns of retreat. In this way, ages were treated as measurements that were independent of the phenomena that were being investigated (Bronk Ramsey, 1998). One potential way this could have been achieved would have been to employ some form of double blind assessment where ages were multiplied by a random factor before being assessed and without the factor being known. One major difficulty however was the volume of data to be assessed and the need for sufficient stratigraphical and contextual data. Extracting and summarizing this for the entire BRITICE v2 database, while maintaining its anonymity, would have been exceptionally time consuming and was considered impractical. As the aim was to avoid basing assessment on pre-existing hypotheses of regional ice retreat, we did not define criteria that referred to other data from other locations. In doing so we judged it solely by our criteria and not with respect to pre-existing hypotheses. While this may not be as objective as a double-blind procedure we believe it balances objectivity with practicality.
By assessing data in this way we acknowledge that supporting local evidence is important. In effect, we need to define what we mean by a "site". In this context a "site" can be defined as either evidence from the same location or from nearby locations that can be reasonably assumed on stratigraphical or geomorphological grounds to have shared the same glaciological history. When considering multiple locations in these terms a "site" could be defined on simple proximity; i.e. any differences in the ages of deglaciation at locations within a few km of each other may not be resolvable given inherent dating uncertainties. A "site" could also be defined on geological evidence that allows correlation between more disparate locations. For example, in Wester Ross moraines mark the extent of a well-defined, Late Glacial readvance, the Wester Ross Readvance (Robinson and Ballantyne, 1979). These moraines can be traced over many kilometres so locations several km apart are likely to have been deglaciated at the same time (at least within dating uncertainties), thus they can be considered to be the same feature (or "site"). Similarly in the Firth of Clyde, areally extensive deposits document a marine incursion post deglaciation (Peacock et al., 1977, 1978). The similarity of the fauna, sedimentology and their occurrence at a similar altitude imply contemporaneous deposition. Thus it could be argued that samples from these deposits, even when taken from sections several km apart, are dating the same feature ("site").
[Figure caption fragment: ... (Rasmussen et al., 2006). Increases in IRD post-date 30 ka and represent advance of the BIIS onto the continental shelf towards its maximum extent.]
A semi-quantitative "traffic light" system was used for the assessment of ages (Table 2).
GREEN signifies ages that are considered reliable and that provide good chronological constraints on a glacially related event at a given site (i.e. deglaciation or a marine incursion). An example would be where there are multiple ages from a site/feature with those ages being consistent with each other. A first principle for an age achieving a GREEN status is that other evidence from the same site supports it.
AMBER denotes ages that are potentially reliable but which lack the weight of supporting evidence to warrant a higher status. An example of this would be where only two (internally consistent) CN ages exist from a site. Although agreement between two ages increases confidence that they are accurate it does not allow us to categorically rule out geological uncertainty. AMBER is also the status given to ages whose accuracy with respect to dating the event of interest remains uncertain due to a lack of good stratigraphical/geomorphological context.
Ages that are assigned a RED status are considered of lower reliability due to the potential for large geological uncertainty. Predominantly these are single samples with no supporting evidence from the same site making assessment of geological uncertainty difficult. In addition, sources of geological uncertainty highlighted in, or evident from, the original publication are justification for a RED status. Examples of this would be where the hard-water effect is a recognised possibility or a bulk sediment sample which possibly comprised multiple carbon sources of different ages.

Guidelines for assessing legacy data
The availability of sufficient supporting information was a first order requirement for ages generated by all techniques to be assessed and assigned a quality assurance rating (cf. Hughes et al., 2016). Sufficient supporting information was needed to allow ages to be recalculated/ recalibrated as well as allowing them to be assessed for potential sources of geological uncertainty. Given the technique dependent sources of geological uncertainty outlined above, we devised the following technique specific guidelines for the consistent quality assurance assessment of the legacy data within the BRITICE-database v2. Applying these guidelines led us to define assessment criteria (Table 3) for assigning the appropriate status (GREEN, AMBER or RED) to the data within the new v3 database.
3.1.1. TCN legacy data
1) All data within a compilation should be calculated using a consistent and appropriate choice of production rate and scaling. Where data are not available for recalculation, ages will be assessed as published. Thus in our dataset 10 Be ages were recalculated (see Section 2.1.1) but due to insufficient data reporting 36 Cl ages were assessed as published.
2) As no assessment of the two primary sources of geological uncertainty (Section 2.1.2) can be made on individual samples, single samples must be treated with extreme caution and were not considered for GREEN or AMBER status.
3) Where multiple samples exist, the consistency of the resultant ages is supported by statistical analysis. In our case we chose the χ 2 R statistic with the criterion for acceptance or rejection of any sample cluster based on Bevington and Robinson (2003). As a minimum requirement for achieving GREEN status a sample cluster must have three ages that agree within their internal (analytical) uncertainties.
4) When considering legacy data our concerns were delimiting the patterns and rates of deglaciation, therefore only samples from features directly relatable to ice extent were eligible for the GREEN status: for example, moraines or glacially transported boulders, and not post-glacial rock-fall deposits.

3.1.2. 14 C legacy data
1) Ages should be calibrated using the latest and most appropriate calibration curve and, in the case of marine samples, with an appropriate range of potential marine reservoir corrections. Therefore if insufficient supporting information exists to do this, data are not given further consideration (i.e. not given a rating).
2) Making an assessment about consideration i) (Section 2.2.2) required the sample material and supporting "site" information to be taken into account.
3) Where data existed as part of a 'radiocarbon inventory' (see Section 2.2.2) the results of this were used to guide assessment. However, as most data within our database did not form part of an inventory it was necessary to broadly categorise types of sample material and develop individual criteria for quality assurance of these. We divided sampled material into bulk samples, microfossil samples (multiple individuals) and macrofossil samples (single individuals).
4) Ages can only be properly assessed if sufficient information relating to stratigraphic context is provided in the original publication. If information is insufficient to ascertain the stratigraphical relationship, ages were only eligible for the lowest, RED quality assurance rating.
5) No assumption is made regarding the reliability or precision per se between 14 C measurements derived from AMS or radiometric counting techniques but the nature of the sample material, where known, is considered when deciding whether to assign GREEN, AMBER or RED status to an age.
6) Given the variability in data reporting we chose not to make post hoc use of δ 13 C. However, where the original authors identify issues through the δ 13 C value this was taken into consideration.

3.1.3. OSL legacy data
1) Given different luminescence properties, the sampled material, method and aliquot size must be taken into consideration (i.e. quartz versus feldspar, standard aliquot versus single grain), as must whether a sensitivity normalized method such as SAR was used.
2) If samples are from a depositional context expected to have been partially bleached prior to deposition, information must be provided that demonstrates an assessment of partial bleaching has been made for the age to be considered reliable. Samples without this could at best be considered AMBER.
3) Ages can only be properly assessed if sufficient contextual information relating to stratigraphic context is provided in the original publication. If information is insufficient, ages were only eligible for the lowest RED quality assurance rating.

Table 2
Definitions of quality assurance criteria.

Quality assurance rating: Definition
GREEN: Ages considered reliable and should be included in analysis. Any conflicts with new data will require to be specifically addressed.
AMBER: Ages available for inclusion in analysis. Their reliability remains open to re-assessment pending new data.
RED: Ages available for comparison with constructed retreat histories. Inclusion in analysis is dependent on new and supporting evidence.
Excluded: Assessed but judged not to make it into the screened database. This is usually because the data are outwith remit (i.e. age filtered) or because there is insufficient information to make an assessment.

Table 3
Quality assurance criteria used in assessment of the BRITICE-database v2.

Pre-requisite for all techniques
- Sufficient data to allow recalculation/recalibration

Radiocarbon, GREEN
- Multiple, consistent macrofossil/microfossil samples
- Reservoir concerns addressed
- Good stratigraphic context with respect to event of interest
Radiocarbon, AMBER
- Single macrofossil/microfossil sample
- Stratigraphically consistent bulk samples
- Reservoir concerns addressed
- Good stratigraphic context with respect to event of interest
Radiocarbon, RED
- Single macrofossil/microfossil
- Single bulk sample
- Poor stratigraphic context with respect to event of interest

TCN, GREEN
- Multiple (3+) samples from a site
- Acceptable reduced Chi-square statistic
- Ages from a feature directly related to event of interest
TCN, AMBER
- Only 2 internally consistent ages from a site
- More than 2 samples not directly related to event of interest
TCN, RED
- Single samples
- No internally consistent ages

Luminescence, GREEN
- A sensitivity normalized protocol was used for analysis (e.g. SAR)
- Any potential for partial bleaching has been addressed using small aliquot/single grain measurements
- Supported by other geochronological data (luminescence or other method) from the same site
- Good stratigraphic relationship to event of interest
Luminescence, AMBER
- Potential for partial bleaching but not addressed using small aliquot/single grain measurements
- Supported by other geochronological data (luminescence or other method) from the same site
- Good stratigraphic relationship to event of interest
Luminescence, RED
- Preliminary ages or an experimental protocol was used for analyses
- Based on feldspar without addressing the potential for anomalous fading
- A single sample with no support from other geochronological data
- Insufficient depositional context or details of analyses
- Poor stratigraphic relationship with respect to event of interest
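As an illustration of how the TCN criteria might be encoded, the sketch below computes a reduced chi-square about the error-weighted mean and maps the result onto the three ratings. The function names are hypothetical, the acceptance threshold is passed in as a parameter because the Bevington and Robinson (2003) criterion is not reproduced here, and two choices are assumptions of this sketch rather than stated rules: no outliers are removed to search for a consistent subset, and two consistent ages from a feature not directly related to the event are rated AMBER.

```python
def reduced_chi_square(ages, sigmas):
    """Reduced chi-square of ages about their error-weighted mean."""
    weights = [1.0 / s ** 2 for s in sigmas]
    mean = sum(w * a for w, a in zip(weights, ages)) / sum(weights)
    chi2 = sum(((a - mean) / s) ** 2 for a, s in zip(ages, sigmas))
    return chi2 / (len(ages) - 1)

def rate_tcn_site(ages, sigmas, directly_related, chi2_threshold):
    """Assign GREEN, AMBER or RED to the TCN ages from one site."""
    n = len(ages)
    if n < 2:
        return "RED"  # single samples
    if reduced_chi_square(ages, sigmas) > chi2_threshold:
        return "RED"  # no internally consistent ages
    if n >= 3 and directly_related:
        return "GREEN"
    return "AMBER"    # two consistent ages, or feature not directly related

# Hypothetical site: three ages (ka) with analytical 1-sigma uncertainties.
ages = [15.2, 15.6, 15.4]
sigmas = [0.4, 0.5, 0.4]
print(rate_tcn_site(ages, sigmas, directly_related=True, chi2_threshold=3.0))
```

Writing the rules as a function makes the assessment repeatable when ages are recalculated with a different production rate or scaling scheme, although contextual judgement of each site is still required.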

Quality assurance on the BRITICE-CHRONO database
The results of the quality assurance exercise undertaken on the BRITICE-database v2 are summarized in Table 4. Only 45 sites (23 CN, 16 14 C, 6 OSL) received the highest (GREEN) quality assurance rating and are considered well dated with respect to constraining a relevant geological event (deglaciation, marine incursion or, for modelling purposes, ice advance). A further 53 sites (19 CN, 31 14 C, 3 OSL) are constrained by ages with the next highest (AMBER) quality assurance rating (Fig. 9). The assessment represents a significant reduction in the amount of data considered suitable for synthesis in comparison to that utilised in a previous reconstruction (e.g. Clark et al., 2012). The assessed database and associated metadata are available in the supplementary data.
The assessed data were imported into ArcGIS v.10.1 and converted into GIS-compatible files containing all of the relevant metadata and our quality assurance rating. These files are available in ArcGIS (.shp) format on request and in Google Earth (.kmz) format in the supplementary data.
Overall the spatial coverage of legacy data that receives either a GREEN or AMBER status is extensive within the terrestrial extent of the former BIIS (Fig. 9). Reliable data from the marine realm are sparse with only four sites being constrained by GREEN or AMBER data. This contrast is to be expected given the restricted availability of marine sampling capabilities for much of the period when legacy data were being collected and the difficulty in obtaining good context with respect to glacial deposits. Additionally, of the three geochronological techniques, only 14 C has been applied to marine samples from the former BIIS and basal marine 14 C ages from the continental shelf must necessarily be considered as minimum deglaciation ages with an unquantifiable but intrinsic geological uncertainty between the timing of deglaciation and the deposition of the dated organic material.
The reduction in available 14 C data is particularly striking. One factor is that many 14 C data exist as parts of dated sequences recording palaeoenvironmental change, so only the basal age is directly relevant for constraining deglaciation. Additionally, only in certain stratigraphic and geomorphic scenarios will organic material be directly relatable to glacial deposits and eligible for the highest quality assurance rating, examples being organic material reworked into till, or organic deposits stratigraphically overlying or underlying till (e.g. McCabe et al., 2007; Ó Cofaigh and Evans, 2007). We emphasise that other scenarios, where basal ages are not directly relatable to glacial deposits and are thus assigned a lower quality assurance rating, do not constitute 'bad data'. This is manifest in the many sites that are assigned an AMBER quality assurance rating. We anticipate a significant contribution from AMBER data in providing boundary constraints in future reconstructions and modelling experiments.
TCN can directly date onshore features related to ice margins, such as moraines (e.g. Bradwell et al., 2008; Ballantyne et al., 2009b; Small et al., 2012, 2016), glacially transported boulders (e.g. Everest et al., 2013; Fabel et al., 2012), and ice-dammed lake shorelines (e.g. Fabel et al., 2010). The TCN technique accounts for the largest number of well-dated sites following quality assurance. Although 42 ages based on 26 Al were included in the BRITICE-database v2, all of these were employed as tests for complex exposure histories alongside parallel 10 Be measurements. They thus provide useful information about glaciation styles and erosion but do not offer additional constraints on the timing of deglaciation. In comparison to 14 C and TCN, luminescence ages make up a smaller component of the BRITICE-database v2.

Towards a Bayesian approach to modelling deglaciation
The quality assurance procedures outlined above treat ages individually, within our definition of a site (Section 3), such that they are considered as independent measurements (Bronk Ramsey, 1998). However, in the context of geological reconstructions these ages exist within a spatial and temporal framework. This concept is in part illustrated by the stratigraphic principle of superposition where, barring turbation or tectonics, a lower layer within a stratigraphic sequence cannot be younger than any overlying layer. This allows the sequence of events (geomorphological features or sedimentary units), the 'prior model' in Bayesian terminology, to be determined independently of the chronological measurements (Buck et al., 1996; Bronk Ramsey, 2008; Chiverrell et al., 2013). This independently constructed relative order of events (prior model) is populated with a series of independent age measurements, often with overlapping age probability distributions, and provides a basis for using Bayesian age modelling (Buck et al., 1996; Bronk Ramsey, 2008) to assess the conformability of the age measurements and generate a model output of the timing of events within a sequence. The use of Bayesian age modelling (Buck et al., 1996; Bronk Ramsey, 2008) has several advantages, particularly the robust handling of outliers (Bronk Ramsey, 2009a, 2009b) and the ability to reduce modelled age uncertainties (Blockley et al., 2007; Chiverrell et al., 2013). It is the intention of the BRITICE-CHRONO project to use Bayesian age modelling to produce glacial chronologies that will subsequently be used to test agreement between data and numerical ice sheet models. This Bayesian age modelling will be informed by the quality assurance protocols outlined in this contribution and we thus outline our approach. In doing so we test a previous application of Bayesian age modelling to the British-Irish Ice Sheet (Chiverrell et al., 2013).
When used in the context of constraining glacial chronologies the prior model consists of a sequence of locations arranged in the order in which they would have been deglaciated. For ice sheets, ice streams, and glaciers this sequence can be determined on glaciological grounds (e.g. deglaciation proceeds from ablation zone to accumulation zone) and geomorphological grounds (e.g. using indicators of past ice flow direction). This a priori knowledge allows sites to be arranged in a spatial sequence of ice-marginal retreat (cf. Chiverrell et al., 2013). Additional constraints from relative dating information can be incorporated in the prior model by considering the stratigraphic relationship between ages and the event of interest. For example, 14 C ages from marine deposits stratigraphically above till provide a minimum age for deglaciation. This limiting age constraint (terminus ante quem in Bayesian terminology) sits within the spatial sequence and indicates that a site was deglaciated before a certain time (to be defined by the ages from that particular site). The prior model is generated without reference to any age determinations such that it is independent of the numerical dating information. Chiverrell et al. (2013) used a Bayesian approach to age model retreat of the Irish Sea Ice Stream, one of the largest ice streams to drain the former BIIS. The dating control used in this effort was drawn from the literature with quality assurance applied in a more piecemeal manner.

Fig. 9. Map of the legacy data from BRITICE-database v2 that has undergone quality assurance and been assigned a GREEN or AMBER rating. Displayed ice sheet margin at 27 ka is from DATED-1 (Hughes et al., 2016). Background elevation data from gebco.net.

Fig. 10. Model specification (OxCal input code) for Irish Sea Ice Stream Bayesian age model.
The Bayesian approach allows the identification of outliers by comparing the overlap between the likelihood probability distribution and the modelled posterior probability distribution (Bronk Ramsey, 2009a). Chiverrell et al. (2013) assigned outlier measurements a prior probability of being an outlier (probabilities ranging from 0.1 to 1), thereby reducing or excluding their impact on subsequent model runs. This approach produced a conformable age model for the Irish Sea Ice Stream retreat sequence (Fig. 2 in Chiverrell et al. (2013)) with overall model agreement indices of >98%, exceeding the >60% threshold advocated by Bronk Ramsey (2009a).
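The overlap comparison can be illustrated numerically. The Python sketch below assumes that the agreement index for an individual age takes the form A = 100 × ∫p(t)p'(t)dt / ∫p(t)²dt, where p is the likelihood and p' the modelled posterior. This form is an assumption adopted for illustration and is not stated in this contribution. The Gaussian distributions, the time grid and all numerical values are hypothetical.

```python
import math


def normal_pdf(t, mean, sd):
    """Gaussian probability density, used here as a stand-in for a dated age."""
    return math.exp(-0.5 * ((t - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def agreement_index(likelihood, posterior):
    """Assumed form: 100 * sum(p * p') / sum(p * p) on a shared, evenly spaced grid."""
    overlap = sum(p * q for p, q in zip(likelihood, posterior))
    self_overlap = sum(p * p for p in likelihood)
    return 100.0 * overlap / self_overlap


# Hypothetical time grid (years) and distributions; none of these values come from the database.
grid = range(10000, 30001, 10)
likelihood = [normal_pdf(t, 20000, 500) for t in grid]
consistent = [normal_pdf(t, 20100, 400) for t in grid]  # posterior close to the likelihood
displaced = [normal_pdf(t, 23000, 500) for t in grid]   # posterior pulled far from the likelihood

a_consistent = agreement_index(likelihood, consistent)
a_displaced = agreement_index(likelihood, displaced)
print(round(a_consistent, 1), round(a_displaced, 3))
```

Under this assumed form, a posterior that sits within the likelihood scores above the 60% threshold, whereas a posterior displaced by several standard deviations scores close to zero and would be flagged as a potential outlier.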
As an experiment we re-ran the Bayesian age model of the Irish Sea Ice Stream using the same initial dating control, but with all measurements assigned a prior probability of being an outlier on the basis of our quality control screening. GREEN data were assigned a prior probability of 0.05 (i.e. 1 in 20) and AMBER data 0.2 (i.e. 1 in 5). RED data were assigned a prior probability of 1 of being an outlier. The Bayesian modelling was undertaken in OxCal 4.2 (Bronk Ramsey, 2013) using a uniform phase model and run as an outlier model (Buck et al., 1991; Bronk Ramsey, 2009a). The models were set up to assess for outliers in time (t), which is appropriate given the range of dating techniques incorporated. We used a Student's t-distribution to define how the outliers are distributed and a scale of 10⁰ to 10⁴ years (cf. Bronk Ramsey, 2009a). The models make the following assumptions:
1. Deglaciation is a progressive process that cannot occur in two places at precisely the same time.
2. There is a constant retreat rate between dated sites. This is akin to assuming a linear sedimentation rate within a depositional sequence.
3. All ages from a given site are dating the same event.
4. Ages provided by different techniques (i.e. 14 C, CN, OSL) are directly comparable.
5. All radiocarbon calibrations are an accurate conversion of a radiocarbon measurement to a calendar age.
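The mapping from quality assurance rating to prior outlier probability can be written down directly. The Python sketch below uses hypothetical helper names, and the strings it emits follow OxCal's CQL syntax for a general outlier model (`Outlier_Model`, `Outlier`) as we understand it. The five degrees of freedom in `T(5)` are an assumption taken from the general outlier model of Bronk Ramsey (2009a) and are not stated here. The specification actually used is the one shown in Fig. 10.

```python
# Prior outlier probabilities per quality assurance rating, as stated in the text.
PRIOR_OUTLIER_PROBABILITY = {"GREEN": 0.05, "AMBER": 0.2, "RED": 1.0}


def outlier_model_statement():
    """Outlier model in time ("t"): Student's t, scale 10^0 to 10^4 years.

    T(5), i.e. five degrees of freedom, is an assumption and is not stated in the text.
    """
    return 'Outlier_Model("General",T(5),U(0,4),"t");'


def outlier_tag(rating):
    """Hypothetical helper: build the Outlier tag to attach to one dated sample."""
    prior = PRIOR_OUTLIER_PROBABILITY[rating.upper()]
    return 'Outlier("General",%g);' % prior


print(outlier_model_statement())
for rating in ("GREEN", "AMBER", "RED"):
    print(rating, outlier_tag(rating))
```

Keeping the probabilities in a single lookup table means the weighting can be changed in one place if a different screening scheme is adopted.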
The model uses a uniform prior (Bronk Ramsey, 2009a, 2009b), which makes several assumptions regarding calibration of 14 C ages (Blockley et al., 2007); however, given the timespan of our model (~8 ka) and the uncertainties associated with other dating techniques, any error introduced by these assumptions is not significant (cf. assumption 5). 14 C measurements on marine fossils received a uniform reservoir correction of 525 years. Additionally, use of the uniform prior is considered appropriate to satisfy assumption 1; that is, events (ice marginal limits) can abut but cannot overlap. A linear interpolation between dated sites (assumption 2) is also implicit in the use of the uniform prior. With a large number of ages this assumption can be considered approximately true (Telford et al., 2004). Additionally, the model approach was designed to investigate large scale controls on retreat rates (e.g. bathymetry, trough width) and for this purpose a linear interpolation is appropriate. Assumptions 1 and 2 have the effect that the Bayesian age model calculates time-averaged retreat rates between two age groupings but does not incorporate any variation in retreat rates within that interval. Given assumption 3, groupings of ages (phases) are classified as being ages from defined sites which can reasonably be assumed to share a glaciological history (Section 3). Phases are delimited by boundaries as this allows events to abut but not overlap. Additionally, as the dating control available comes from disparate locations, the use of boundaries corrects for bias within the OxCal program that can be introduced by major gaps within a sequence (Blockley et al., 2004). Finally, calibrated 14 C ages are not, sensu stricto, directly comparable to CN or OSL ages as they are reported in reference to a fixed datum (1950 AD) whereas CN/OSL ages are reported as years before present (i.e. years before sampling). However, given the uncertainties associated with these techniques compared to 14 C and the timescales being investigated, we consider assumption 4 to be valid. Our model specification is shown in Fig. 10.
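The size of the datum mismatch is easy to quantify. In the Python sketch below, an age counted back from the year of sampling is re-expressed relative to 1950 AD. The function name, the sampling year and the age are hypothetical illustrations and are not values from the database.

```python
RADIOCARBON_DATUM_YEAR = 1950  # calibrated 14C ages are reported relative to 1950 AD


def years_before_1950(age_before_sampling, sampling_year):
    """Re-express an age counted back from the sampling year relative to 1950 AD."""
    return age_before_sampling - (sampling_year - RADIOCARBON_DATUM_YEAR)


# Hypothetical CN age of 16,000 years for a sample collected in 2010.
print(years_before_1950(16000, 2010))  # 15940
```

In this hypothetical case the shift is 60 years, which is small wherever the age uncertainty is several hundred years or more.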
The results of the age modelling are shown in Fig. 11. The input data produced a conformable sequence with modelled posterior outlier probabilities (Table 5). The overall results are broadly similar and produced general agreement between modelled boundary ages for both approaches (Table 6). Notably, the age modelled boundaries are consistently better constrained when using the screened data. In all cases this improvement is ≥ 0.5 ka (Table 6). This demonstrates that, in this particular case, the original results were primarily controlled by the availability of good quality data in key locations.
RED data did not contribute to the overall model result (outlier probability = 1); however, they are conformable within the model (i.e. they have age probability distributions that overlap). It is now known that some of the RED data included in this analysis are unsound. For example, the 10 Be exposure age from Shipman Head (C_date Shipman Head1; McCarroll et al., 2010) is now known to be too old due to a significant muonic contribution to its 10 Be inventory from previous long-term exposure (Smedley et al., submitted). In the case of at least some of the RED data, the fact that they are conformable within the Irish Sea sequence is likely to be purely fortuitous. Such erroneous yet conformable data could make a significant contribution to an age model and influence subsequent interpretations, and would not be identifiable using Bayesian outlier detection alone. Removing this type of data from analysis requires a manual approach to detecting potentially unreliable data, such as that outlined here. Additionally, it demonstrates that obtaining good quality data from sites is critical to maximising the potential for applying Bayesian age modelling to glacial sequences. This is particularly the case as the relatively large uncertainties of techniques such as CN and OSL (compared to 14 C) mean that there is a larger possibility that some part of the likelihood probability distribution will be conformable. A combination of a manual approach to outlier detection and a model averaging approach that weights data according to how likely they are to be correct (cf. Bronk Ramsey, 2009a) produces a robust procedure for identifying potentially erroneous data and subsequently minimising their influence.
A final advantage of the Bayesian approach is that when an age that has not received the highest quality rating returns a lower modelled posterior probability of being an outlier than the prior originally assigned, confidence increases that the age is accurate and not affected by significant geological uncertainty.

Conclusions
The process of assessing the legacy BRITICE-CHRONO v2 database emphasises the importance of adequate data reporting for maximising the utility of legacy data for future workers (e.g. Stuiver and Polach, 1977; Balco et al., 2008; Dunai and Stuart, 2009; Frankel et al., 2010; Millard, 2014). While each technique has different specifics, it is important to include sufficient methodological and laboratory information to allow ages to be recalculated or updated as understanding of techniques improves (e.g. CN production rates, 14 C calibration). Where such information is missing and cannot be obtained otherwise, legacy data can become obsolete and the important information they hold is lost. Complete reporting of how samples were processed and of all associated measurements can allow detection of issues such as contamination as techniques develop. In addition to technical information, future workers require sufficient observational information to allow some post hoc assessment of the context of samples. While individual studies are likely to make different judgements as to what observational information is important, workers should be mindful that, in the future, their results may be revisited with different aims in mind, and the inclusion of as much information as reasonably possible would be advantageous.
Beyond the issue of data reporting, the procedures outlined in this review highlight some general points regarding sampling strategies for studies interested in constraining glacial chronologies. It is clear from the Bayesian age model presented in Section 6 that single ages can be problematic. They are difficult to assess for geological uncertainty and have always been treated with the most scepticism. Their inclusion in a prior model allows the Bayesian approach to identify outliers in time; however, it cannot identify erroneous but conformable data. While such data may have little effect on large scale reconstructions, on more local or regional scales they could produce deglacial chronologies that do not accurately reflect the timing of deglaciation in certain locations. This could, for example, lead to incorrect estimates of retreat rates. Consequently, where material and resources exist, focus should always be on obtaining multiple ages from a site. Where this is not possible, isolated ages can be assessed using technique-specific criteria to identify the potential for geological uncertainty before inclusion in any subsequent synthesis. Similarly, the Bayesian age model highlights the importance of having well-dated sites (e.g. with a good clustering of ages) as it is these that dominate the age model. Finally, this paper outlines our approach to undertaking quality assurance for dating ice sheet retreat. Future studies will inevitably begin from different starting points, both in terms of the number and type of data available and how (or if) those data have been compiled. While the issues regarding geological uncertainty are ubiquitous, the choices made in how these issues are addressed with respect to legacy data will be subject to some degree of subjectivity. Consequently, we are outlining our criteria as an example of how a quality assurance process for dating ice sheet retreat can be undertaken and to document a decision making process for a data-set that will be used to inform a substantial body of further work. We hope doing so encourages further consideration of the quality assurance issue for the palaeo-ice sheet community.