The cosmic buildup of dust and metals Accurate abundances from GRB-selected star-forming galaxies at 1.7 < z < 6.3

The chemical enrichment of dust and metals in the interstellar medium of galaxies throughout cosmic time is one of the key driving processes of galaxy evolution. Here we study the evolution of the gas-phase metallicities, dust-to-gas (DTG) ratios, and dust-to-metal (DTM) ratios of 36 star-forming galaxies at 1 . 7 < z < 6 . 3 probed by gamma-ray bursts (GRBs). We compiled all GRB-selected galaxies with intermediate-( R = 7000) to high-resolution ( R > 40000) spectroscopic data, including three new sources, for which at least one refractory (e.g., Fe) and one volatile (e.g., S or Zn) element have been detected at S / N > 3. This is to ensure that accurate abundances and dust depletion patterns can be obtained. We ﬁrst derived the redshift evolution of the dust-corrected, absorption-line-based gas-phase metallicity, [M / H] tot , in these galaxies, for which we determine a linear relation with redshift [M / H] tot ( z ) = ( − 0 . 21 ± 0 . 04) z − (0 . 47 ± 0 . 14). We then examined the DTG and DTM ratios as a function of redshift and through three orders of magnitude in metallicity, quantifying the relative dust abundance both through the direct line-of-sight visual extinction, A V , and the derived depletion level. We used a novel method to derive the DTG and DTM mass ratios for each GRB sightline, summing up the mass of all the depleted elements in the dust phase. We ﬁnd that the DTG and DTM mass ratios are both strongly correlated with the gas-phase metallicity and show a mild evolution with redshift as well. While these results are subject to a variety of caveats related to the physical environments and the narrow pencil-beam sightlines through the interstellar medium probed by the GRBs, they provide strong implications for studies of dust masses that aim to infer the gas and metal content of high-redshift galaxies, and particularly demonstrate the large o ﬀ set from the average Galactic value in the low-metallicity, high-redshift regime.


Introduction
The baryon cycle, which includes processes such as the infall of neutral, pristine gas onto galaxies and their subsequent chemical enrichment with dust and metals, is one of the fundamental drivers of galaxy formation and evolution (Tinsley 1980;Dayal & Ferrara 2018;Maiolino & Mannucci 2019;Péroux & Howk 2020).In particular, dust serves as a catalyst for the production of molecular hydrogen H 2 on the surfaces of its grains (Hollenbach & Salpeter 1971;Black & van Dishoeck 1987), an important prerequisite for star formation.The fraction of dust to the overall gas and metal abundances is governed by the most predominant dust production channels, in addition to the efficiency of grain growth in the interstellar medium (ISM) or potential supernovae dust destruction or condensation scenarios (Draine 2003;Dunne et al. 2003;Mattsson et al. 2012;Dwek 2016;Schneider et al. 2016;De Vis et al. 2021).
Long-duration gamma-ray bursts (GRBs hereafter) offer unique insights into the dust and metal abundances of the ISM in their star-forming host galaxies (Savaglio et al. 2003;Jakobsson et al. 2004;Fynbo et al. 2006;Prochaska et al. 2007).Since most GRBs are associated with the death of massive stars (e.g., Woosley & Bloom 2006), they are linked to active star forma-tion and thereby provide a reliable probe of star-forming galaxies through most of cosmic time (Jakobsson et al. 2006b;Kistler et al. 2009;Robertson & Ellis 2012;Tanvir et al. 2012;Greiner et al. 2015;Perley et al. 2016;Ghirlanda & Salvaterra 2022).Moreover, GRBs are some of the most energetic, brightest cosmological sources known (Gehrels et al. 2009;Malesani et al. 2023), enabling detailed studies of the ISM in their host galaxies based on absorption-line spectroscopy, even out to z ≳ 6 (Kawai et al. 2006;Hartoog et al. 2015;Saccardi et al. 2023).While recent observations of nearby GRBs connected to dynamical merger origins challenge this picture (Rastinejad et al. 2022;Levan et al. 2023) and current evidence seem to point to a potential "metallicity-bias" limiting the production of GRBs in metalrich environments at z ≲ 2 (Levesque et al. 2010;Japelj et al. 2016;Palmerio et al. 2019;Graham et al. 2019;Björnsson 2019), these effects are arguably small in the high-redshift universe.
Here we present new measurements and comprehensive analyses of the metal abundance and dust content of three GRB systems at z > 2, studied through absorption-line spectroscopy of their bright optical/near-infrared afterglow.To complement these measurements, we further compile all GRB afterglows at z ≳ 2 observed with intermediate to high-resolution spectrographs for which similar measurements can be obtained, to provide the most comprehensive study to-date of the metallicity and dust content of GRB-selected star-forming galaxies through cosmic time.
The paper is structured as follows.In Sect. 2 we present the observations of the three new GRBs, and describe the overall sample compilation.In Sect.3, we detail the derivation of the metal abundances, the visual extinction, the dust-corrected metallicities and the DTG and DTM ratios for each GRB.In Sect. 4 we present our results, and quantify the evolutionary trends of these properties with redshift and metallicity.Finally in Sect. 5 we discuss and conclude on our work, with particular emphasis on the implications of our results for galaxy evolution studies at high-redshift.
Throughout the paper we assume the concordance ΛCDM cosmological model with Ω m = 0.315, Ω Λ = 0.685, and H 0 = 67.4km s −1 Mpc −1 (Planck Collaboration et al. 2020).We derive relative abundances of specific elements X and Y using the solar abundances as reference, [X/Y] = log(N X /N Y ) − log(N X /N Y ) ⊙ , assuming the solar chemical abundances from Asplund et al. (2021) based on the recommendations by Lodders et al. (2009).Unless indicated otherwise, all uncertainties are given at 1σ confidence throughout the paper.

Observations and sample compilation
In this work, we present measurements of the metal abundance and dust content in the sightlines of three new GRBs, GRBs 190106A, 190919B, and 191011A, observed as part of the ESO-VLT STARGATE ToO Programme (PI: N. R. Tanvir).Further, we compile all GRB afterglows to-date, matching a few predefined criteria as detailed below.The large majority of the GRB afterglows in this work have been observed with the ESO VLT/X-shooter spectrograph (Vernet et al. 2011) as part of the XS-GRB survey programme (PI: J. Fynbo; Selsing et al. 2019;Bolmer et al. 2019).Our compiled sample also includes eight bursts observed with the higher-resolution spectrographs UVES on the VLT (6 out of 8;Dekker et al. 2000) and ESI on Keck (2 out 8;Sheinis et al. 2002).For this work, we require that the GRB afterglow have been observed with intermediate (R = 7000) to high-resolution (R > 40, 000) spectrographs to ensure the robustness of the metal abundance measurements.We additionally impose that at least the wavelength regions of the redshifted transitions of the refractory element Fe ii and the volatile elements S ii or Zn ii are covered and that the signal-tonoise (S/N) ratio is S/N > 3 per resolution element in the regions surrounding these transitions.This is to further optimize the column density measurements of these transitions, and to ensure that we cover at least one heavily depleted and one volatile element to compute the dust-corrected gas-phase metallicities and the dust depletions in GRB-selected galaxies.190106A and 191011A were initially detected with the Neil Gehrels Swift Observatory (Swift hereafter; Gehrels et al. 2004), as reported by Sonbas et al. (2019) and Laha et al. (2019), respectively.GRB 190919B was detected with INTE-GRAL (Winkler et al. 2003) as reported by Mereghetti et al. (2019).Following the detection of the optical counterpart, we obtained ultraviolet to near-infrared (300-2500 nm) spectroscopy of the GRB afterglows with the X-shooter spectrograph (Vernet et al. 2011) mounted on the European Southern Observatory (ESO) Very Large Telescope (VLT) Unit Telescope 2 (in 2019).The observations were carried out 11 hr (GRB 190106A), 4.87 hr (GRB 190919B), and 23.3 min (GRB 191011A, using the rapid-response mode) after trigger.Each observation covered the ultraviolet to near-infrared simultaneously using the UVB, VIS and NIR arms of the VLT/X-shooter with slit-widths of 1 ′′ .0 (UVB) and 0 ′′ .9 (VIS,NIR) and nominal spectral resolutions of R = λ/∆λ = 5400 (UVB), 8900 (VIS), and 5600 (NIR).The delivered spectral resolution are in most observations superior to the nominal, since the seeing full-width-at-half-maximum (FWHM) is considerably smaller than the slit width (Selsing et al. 2019).

GRBs
The spectroscopic data were reduced and processed following a similar approach as described in Selsing et al. (2019).We use the version v. 3.5.3 of the ESO X-shooter pipeline (Modigliani et al. 2010).The flux-calibrated 1D spectra were moved to the vacuum-heliocentric system in the process and corrected for Galactic extinction along the line-of-sight using the values from Schlafly & Finkbeiner (2011).

Sample compilation
In addition to these bursts, we compile all GRB afterglow measurements from the literature following our criteria outlined above.This includes measurements from the pre-X-shooter era of the GRBs 000926, 030226, 050730, 050820A, 050922C, 071031, 080413A, and 081008 (Savaglio et al. 2003;Shin et al. 2006;Prochaska et al. 2007;Piranomonte et al. 2008;Ledoux et al. 2009;D'Elia et al. 2011;Wiseman et al. 2017;Zafar & Møller 2019).Further, we consider all the GRBs observed as part of the XS-GRB legacy survey.Specifically, we adopt the column density and metallicity measurements from Bolmer et al. (2019), which includes GRB 090809A up to GRB 170202A.Beyond this, the GRB afterglows in our sample were all observed with the VLT/X-shooter as part of the STARGATE programme (PI: N. R. Tanvir).In addition to the three bursts presented above, we further include GRBs 181020A, 190114A, and 210905A.GRBs 181020A and 190114A have already been presented in Heintz et al. (2019a), but we here rederive their basic properties for consistency and homogeneity with the rest of the sample, and additionally include the new metallicity measurements for GRB 210905A from Saccardi et al. (2023).We note that GRB 180325A has been observed with X-shooter as part of the STARGATE programme as well (Zafar et al. 2018a).However, since the relevant metal line transitions are all heavily saturated, hindering robust metallicity and depletion measurements, we exclude this burst from the sample.The full GRB afterglow sample is comprised of 36 bursts, with their physical properties summarized in Table 1.

Metal abundances
To model the absorption lines of the three new GRBs 190106A, 190919B, and 191011A considered here, we use the Python module VoigtFit (Krogager 2018).This code takes the observed spectra as input, convolves the Voigt-profiles to match the delivered spectral resolution, and provides the best-fit column density N and broadening parameter b for each transition separately for each of the identified velocity components.We model and tie b and the velocity structure for all the low-ionization transitions, based on the assumption that they physically trace the bulk of the neutral gas (e.g., Prochaska & Wolfe 1997), that is N FeII = N Fe .This is physically motivated since the ionization potentials of the neutral ions considered here are below that of hydrogen (13.6 eV), such that they will predominantly be in the singly-ionized state in the neutral gas-phase.Further, these absorption-line abundances have been found to not be influenced by photoionization from the GRB prompt emission since they typically probe gas in the ISM on kpc scales away from the GRB progenitor (Vreeswijk et al. 2007;Prochaska et al. 2007Prochaska et al. , 2008;;Ledoux et al. 2009;Heintz et al. 2018).
The absorption-line spectra and the best-fit models are shown in Appendix A. To determine the gas-phase metallicity, [X/H] = log(N X /N H ) − log(N X /N H ) ⊙ , for each burst, we first fit the H i column density based on the broad damped Lyman-α absorption trough.Then, we rely primarily on the volatile elements Zn or S to determine the metal abundances.To determine the overall dust depletion level, quantified via [Zn/Fe] = log(N Zn /N Fe ) − log(N Zn /N Fe ) ⊙ (De Cia et al. 2018, see e.g.), we either derive it directly from the measured abundances or, in the case that Zn is inaccessible, we determine the expected [Zn/Fe] exp following the relations from De Cia et al. (2018) as described below.The derived H i column densities, [X/H] and [Zn/Fe] for each of the bursts in the full sample are summarized in Table 1.

Dust-corrected metallicities
Due to mild or strong dust depletion of volatile and refractory elements, respectively, a significant fraction of the metals will be missing from the observed gas-phase abundances, [X/H] obs .To gauge the actual metal abundance of the GRB host galaxies we therefore need to take into account the metals both in the dustand gas-phase.The model for the expected relative abundances can be expressed as where δ X is the dust depletion of element X (see e.g., De Cia et al. 2016;Konstantopoulou et al. 2022) and A2 X and B2 X are empirically computed linear depletion parameters, here taken from Konstantopoulou et al. (subm., see also De Cia et al. 2016).For all cases, δ X ≤ 0, with more negative values indicating higher depletion levels.For each source, we can thus derive the total, dust-corrected metallicity [M/H] tot and the overall strength of dust depletion, [Zn/Fe] fit , by performing a fit minimizing the difference between the observed relative abundances [X/H] obs and the relative abundances [X/H] exp given by Eq. (1).
To sample the posterior distribution of the best fit parameters, we use the implementation of a Dynamical Nested Sampling algorithm provided by the dynesty package (Skilling 2004;Higson et al. 2018;Speagle 2020).This type of sampling algorithm has the benefits of focused Bayesian posterior estimation as performed by Markov Chain Monte Carlo (MCMC) samplers while retaining the ability to determine marginal likelihoods for model comparison like other Nested Sampling algorithms.We use a uniform prior between 0 < [Zn/Fe] < 1.7 for the depletion strength parameter, where the upper limit is motivated by the strongest levels of depletion in Galactic sightlines presented by Jenkins (2009).For the dust-corrected metallicity, we also use a uniform prior, but allow it to run from [M/H] tot = −3.0 to 1.0 which encompasses all known Milky Way sightlines and high-z GRB absorption systems.As expected, the dust-corrected metallicities are overall higher than the metallicities inferred using Zn, S or Si as tracers.For GRBs 190106A, 190919B, and 191011A, we derive dust-corrected gas-phase metallicities of [M/H] tot = −0.40 ± 0.10, −1.25 ± 0.15, and −0.63 ± 0.08, respectively.The dust-corrected metallicities reported in Bolmer et al. (2019) were computed following a similar approach, and the pre-X-shooter GRB sample were re-analysed to compute dustcorrected metallicities following De Cia et al. (2018).The full sample covers a large range in metallicities of [M/H] tot = −2.3 to 0.2, i.e. 0.5% to 150% solar abundances.

Line-of-sight visual extinction
To determine the total integrated amount of dust in the GRB host-galaxy sightline, we model the extinction of the observed afterglow spectral energy distribution (SED) (e.g., Watson et al. 2006;Schady et al. 2010;Zafar et al. 2011;Greiner et al. 2011;Covino et al. 2013;Zafar et al. 2018b).Since the optical afterglows of GRBs follow an underlying smooth, temporallyvarying power-law (Sari et al. 1998), it is possible to very accurately measure the visual extinction A V and the total-to-selective extinction R V .In contrast, the dust in DLAs in quasar sightlines are more difficult to disentangle due to the potential additional extinction of the background quasar spectrum and the uncertainties in functional shape.Following Heintz et al. (2019b), we adopt from the Swift/XRT repository1 the X-ray spectral slope in photon units, Γ, as derived from the Swift/XRT afterglow spectrum as prior for the intrinsic spectral slope converted to a function of wavelength as F λ = F 0 λ Γ−∆β−3 and allow for the synchrotron cooling spectral break ∆β to take a value of ∆β = 0.0 or 0.5 (Sari et al. 1998).
We model the observed, dust-extinguished afterglow as F obs λ = F λ × 10 −0.4A λ where A λ is the extinction as a function of wavelength.To determine the visual extinction A V for three new GRBs, we assume the average SMC extinction law (as parametrized by Gordon et al. 2003) due to the lack of any evidence for the rare 2175 Å extinction bump or an unusual steep (or flat) reddening curve in this sample (see also Appendix A).This is also in line with past GRB observations (Savaglio & Fall 2004;Perley et al. 2008;Kann et al. 2006Kann et al. , 2010;;Friis et al. 2015;Zafar et al. 2018b;Corre et al. 2018), where the 2175 Å dust bump is only observed in a handful of cases (Zafar et al. 2012(Zafar et al. , 2018a;;Heintz et al. 2019c).More exotic extinction curves have also been seen, either being extremely steep as in the case of Table 1 Overview of the absorption-derived GRB host galaxy ISM properties.GRB  050820A 2.6150 0.27 ± 0.04 21.05 ± 0.10 -0.39 ± 0.10 Zn 0.83 ± 0.05 -0.49 ± 0.10 VLT/UVES+HIRES (2,4,5,6)  050922C 2.1990 0.09 ± 0.03 21.55 ± 0.10 -2.09 ± 0  GRB 140506A (Fynbo et al. 2014;Heintz et al. 2017) or in a few bursts showing more flat, "grey" dust distributions (e.g., Stratta et al. 2004Stratta et al. , 2005;;Perley et al. 2008).However, in the majority of cases, an SMC-like extinction curves appear to be the most prevalent considering GRBs with spectral coverage from X-rays to the ultraviolet and near-infrared (Zafar et al. 2018b).
We normalize the intrinsic afterglow spectrum to the flux level in the NIR arm around the wavelength region of the typical K-band (∼ 2µm).We fix the redshift to z GRB and thereby only fit for A V for each case.We derive A V = 0.83 ± 0.03, < 0.03 (3σ), and 0.48 ± 0.13 mag for GRBs 190106A, 190919B, and 191011A, respectively.The spectra and best-fit extinction curve models are shown in Appendix A and the results are summarized for the full sample in Table 1.
Article number, page 4 of 16 Heintz et al.: Dust and metal abundances of GRB-selected star-forming galaxies at 1.7 < z < 6.3

Dust-corrected metallicity evolution with redshift
In Fig. 1 we present the redshift evolution of the dust-corrected metallicities measured for the 36 GRBs in our sample, spanning z GRB = 1.7 − 6.3.We perform a linear fit of the data, including the errors on [M/H] tot , from which we find [M/H] tot (z) = (−0.21± 0.04)z − (0.47 ± 0.14).This slope is slightly steeper and the intercept at z = 0 higher than inferred previously for GRBs, and implies a significant evolution compared to previous results (Cucchiara et al. 2015).Their work did not consider the dust-corrected metallicities, however, which would explain the offset in the intercept.The evolution of [M/H] tot with redshift is observed to be even steeper for DLAs in quasar sightlines (De Cia et al. 2016, 2018), with a slope of −0.32 ± 0.04 as inferred from the dust-corrected metallicities of their large sample.The observed lower intercept of the quasar-DLA relation (De Cia et al. 2018) is also expected since GRB-selected samples are weighted towards more metal-rich galaxies compared to quasar DLAs due to their lower impact parameters (Fynbo et al. 2008;Arabsalmani et al. 2015).We caution that our inferred metallicity evolution with redshift is largely driven by the high-redshift points at z ≳ 5, which is still sparsely populated and may be subject to a more severe selection bias as they will appear optically "dark" (e.g., Fynbo et al. 2001).
To account for the low-metallicity GRB systems which carry less gas in our analysis, we derive the average H i-weighted metallicities ⟨[M/H]⟩ HI , defined as and shown in Fig. 1.The errorbars on z represent the span on the redshift range and the standard deviation on ⟨[M/H]⟩ HI .We divide the sample into larger redshift bins at higher redshifts, considering points at z = 1.7 − 2.2, z = 2.2 − 2.8, z = 2.8 − 3.5, z = 3.5 − 4.5, and z = 4.5 − 6.3, respectively, to account for the sparser number of sources in our sample at early cosmic epochs.
To put our results into context, we compare our measurements to recent simulations mapping the chemical enrichment and the metal mass density in galaxies across cosmic time.In particular, we adopt the "default model" (DM) and the "modified model (MM)" from the L-Galaxies simulations (Henriques et al. 2020;Yates et al. 2021a), the EAGLE simulations (Crain et al. 2015;Schaye et al. 2015), and the IllustrisTNG-100 (Pillepich et al. 2018;Springel et al. 2018), as compiled and described in detail by Yates et al. (2021b).While all simulations seem to find slopes for the redshift evolution of [M/H] tot (z) in agreement with our measurements, the EAGLE and TNG-100 simulations return a higher normalization, in particular at z ≳ 5.This could be due to an over-production or over-retention of metals inside galaxies in these particular simulations, which could indicate that EAGLE and TNG-100 contain an overabundance of massive galaxies.Overall, the L-Galaxies MM galaxy evolution models seem to best reproduce the data, with near-solar metallicity at z ∼ 0 and reaching [M/H] tot = −2 at z ≳ 5.These lower cosmic metallicities are achieved in L-Galaxies MM through highly efficient removal of metal-rich material from galaxies by supernova-driven galactic winds (see Yates et al. 2021a).We also note the particular metal-poor GRB system, GRB 050730 with [M/H] tot = −2.31± 0.18 at z = 3.969 which neither of the models are able to reproduce and is also substantially offset from the underlying metallicity-evolution probed by the GRB sample.This is most likely related to the selection effects of the highresolution VLT/UVES sample (Ledoux et al. 2009), but overall still imply that very metal-poor galaxies exist at z = 4.These observations thus provide new statistics on galaxy properties and their population scatter, which has to be considered in most recent simulation frameworks.
Further, while GRBs do not have the same biases as emission-selected galaxies and thus provide a more complete census of star-forming galaxies at high-z (Fynbo et al. 2008), they may show an aversion to massive, metal-rich host galaxies at z < 2 (Perley et al. 2013;Schulze et al. 2015;Vergani et al. 2015Vergani et al. , 2017;;Japelj et al. 2016;Palmerio et al. 2019).This would explain the lower intercept at z = 0 of the GRB absorbers compared to simulations.However, this does not explain the offset at higher redshifts (z > 3), where GRBs are found to robustly trace the star-forming galaxy population.We also that note that we observe a substantial scatter in the dust-corrected metallicities for a given redshift in the GRB sample, which is not recovered by any of the simulations.This observed scatter potentially seems to decrease with increasing redshift, though this may simply be due to lack of statistics.
While GRBs provide unique measures of the gas-phase metallicities in the ISM of galaxies out to high redshifts, other recent efforts to characterize the metallicities of galaxies out to and beyond z ≈ 4 have recently been carried out in emission as well (e.g., Sanders et al. 2020;Cullen et al. 2021;Heintz et al. 2022a;Curti et al. 2023).New approaches to derive metallicities based on FIR line features such as [O iii]−88µm detectable by ALMA at the same epoch has also recently been established (e.g., Jones et al. 2020).However, these galaxies are luminosity-selected, and therefore represent only the most massive and metal-rich population of star-forming galaxies at these redshifts.Indeed, Cullen et al. (2021) find oxygen abundances in the range 12 + log (O/H) = 7.7 − 8.4 (i.e.[M/H] ≈ − 0.80 to 0.0) for galaxies at z ≈ 3, which is systematically higher than the average GRB absorption-based metallicity at this redshift (likely related to their high stellar masses, M ⋆ > 10 8.5 M ⊙ ).Similarly, Jones et al. (2020) derive oxygen abundances in the range 12 + log (O/H) = 7.5 − 8.2 (i.e.[M/H] = −1.2 to −0.5) for galaxies at z ≳ 7, representing only the top 15% most metalrich GRB host galaxies at these redshifts.Moreover, metallicity measurements from nebular emission lines, such as those taken by Cullen et al. (2021), are strongly dependent on the strong-line diagnostics used (Kewley & Ellison 2008), and generally only represent the metals in H ii regions, which may be a poor reflection of the abundances in the more diffuse ISM.Further comparing the redshift evolution of luminosity-selected galaxies, we find that the evolution inferred for GRBs is slightly steeper than the slope ∆ log (O/H)/∆z ≈ −0.11 ± 0.02 measured by Sanders et al. (2021) from the MOSDEF galaxy survey, the latter also consistent with the results of Jones et al. ( 2020) from z ≈ 0 − 8.This discrepancy (at 2σ confidence) can potentially be due to the different galaxy luminosity distributions and mass ranges probed with either approach or attributed to the different evolution of the stellar and gas-phase metallicities in galaxies (e.g., Yates et al. 2021b;Fraser-McKelvie et al. 2022) or total integrated vs. lineof-sight effects (e.g., Arabsalmani et al. 2023).

The evolution of the dust-to-metals ratio
One way of inferring the dust-to-metal (DTM) ratio in the GRB sightlines is by using the direct measurements of N HI and [M/H] tot to trace the equivalent metal column density, log N M = log N HI + [M/H] tot , and the visual extinction A V , which traces the total integrated dust column in the line-of-sight.This is presented in Fig. 2. Overall, we observe a substantial scatter with

GRBs [M/H] HI Best-fit
Fig. 1 Dust-corrected metallicity [M/H] tot as a function redshift for the GRB-selected galaxies.The small red data points show individual measurements, and the large red hexagons represent the H i-weighted means with redshift where the errorbars denote the redshift interval and 1σ dispersion, respectively.The best-fit relation [M/H] tot (z) = (−0.21± 0.04)z − (0.47 ± 0.14) is shown as the solid black line, with the dark-and light-shaded gray regions indicating the 1 and 2σ confidence intervals.For comparison, we overplot the average dust-corrected metallicities of MW, LMC, and SMC sightlines and predictions from the compiled set of simulations from Yates et al. (2021b): the "default model" (DM) and the "modified model (MM)" from the L-Galaxies simulations (Henriques et al. 2020;Yates et al. 2021a), the EAGLE simulations (Crain et al. 2015;Schaye et al. 2015), and the IllustrisTNG-100 (Pillepich et al. 2018;Springel et al. 2018).Generally, all observations and simulations seem to find similar slopes of ∆ log (O/H)/∆z ≈ 0.1 − 0.3.However, only the L-Galaxies simulations are able to reproduce the lower average metallicities inferred from the GRB sightlines.respect to a constant DTM ratio (dashed curve), with an average value in the GRB sample of DTM SED = log A V − (log N HI + [M/H] tot ) = 4 × 10 −22 mag cm 2 .This is consistent with previous GRB measurements (e.g., Zafar & Watson 2013;Wiseman et al. 2017;Zafar & Møller 2019), and slightly lower than the DTM measured for the Milky Way, DTM Gal = 4.5 × 10 −22 mag cm 2 (Watson 2011), though still consistent within the uncertainties.Notably, A V does not seem to decrease significantly below log N HI + [M/H] tot < 20.0, which might suggest that a nonnegligible fraction of the dust in the line-of-sight is not associated with the neutral gas-phase and instead might originate in the more ionized medium.
Further, we can infer the mass of the elements X i in the dustphase relative to the total metal mass in the line-of-sight, the DTM mass ratio, based on the total, dust-corrected metallicity and depletion level inferred for each GRB host galaxy.Following the approach described in Konstantopoulou et al. (subm.)we derive where δ X i is the dust depletion of each element X (see also De Cia et al. 2016;Konstantopoulou et al. 2022), W X i the atomic weight, and 10 ([X i /H] ⊙ +[M/H] tot ) represents the total metal column of each element X.It is evident that the dust-corrected metallicity [M/H] tot cancels out such that the DTM mass ratio is independent of the overall metallicity of the system.While only a subset of all the expected metals in the dust-and gas-phase are measured, we calculate the total contribution from each element based on the overall depletion level [Zn/Fe].We derive the depletion for each element X from the empirical relations, δ X = A2 X + B2 X × [Zn/Fe], assuming the empirical depletion coefficients A2 X and B2 X from Konstantopoulou et al. (2022).
In Fig. 3 we show the evolution of the DTM ratio as a function of redshift, both considering the DTM derived from the visual extinction A V and the total metal column density, DTM SED , and the depletion-derived DTM mass .We observe no clear evolution of DTM SED with redshift, with a Pearson correlation coefficient of r = −0.01,and the sample overall shows a large scatter.This is consistent with earlier results using GRB and quasar absorbers to probe DTM SED (Zafar & Watson 2013).The depletion-derived DTM mass shows a mild evolution with redshift consistent with simulations (e.g.Li et al. 2019), with a best-fit DTM = (−0.03± 0.01) × z + 0.35 ± 0.05, and a Pearson r coefficient of r = −0.34.The GRB-selected galaxies further show systematically lower DTM mass than the Milky Way (MW), Lymanbreak galaxies at z ≈ 3 (Shapley et al. 2020), and are on average also more dust deficient than the Small and Large Magellanic Clouds (SMC and LMC) (Konstantopoulou et al., subm.).Previous studies of the DTM ratios of high-redshift galaxies using GRB absorbers reached similar conclusions, though based on a different parametrization of the DTM relative to the Milky Way average (De Cia et al. 2013;Wiseman et al. 2017).
In Fig. 4 we now consider the evolution of the DTM ratio as a function of the total dust-corrected metallicity, again using both the extinction and depletion-derived expressions for the DTM.We observe a large scatter in the relation with DTM SED , but find evidence for a potential mild anti-correlation with increasing metallicity with r = −0.41.On the contrary, we observe a significant correlation with r = 0.68 of DTM mass with increasing metallicity, with a best-fit relation of DTM mass = (0.11 ± 0.03) × [M/H] tot + (0.37 ± 0.04).This suggests that galaxies with metallicities relative to solar of 10% to 1% will have DTM mass ratios that are ≈ 60% to ≈ 30% of the Galactic average.This is in good qualitative agreement with predictions from some simulations (Vijayan et al. 2019;Hou et al. 2019;Graziani et al. 2020).However, GRB hosts do suggest more efficient dust production at low metallicities above z ∼ 2 than is found in simulations such as L-Galaxies MM (see Fig. 4 and Yates et al. in prep.).This is likely due to inefficient grain growth (or other production mechanisms) at high redshift in such simulations, although biases in observational samples could also play a role (see further discussion in Sect.4.4).We also caution that absorbing gas is a mix of clouds with different chemical properties, and this complexity might be difficult to take into account in the simulations.
The discrepancy between the metallicity evolution of DTM SED and DTM mass might originate from the distinct dustphases probed by these two approaches.Dust depletion (and thus DTM mass ) traces the amount of dust in the warm neutral medium in the GRB host galaxy ISM/CGM and is less sensitive to clumps of dense cold gas, in particular if rich in carbonaceous grains (Konstantopoulou et al. subm.).A V on the other hand probes the integrated extinction along the line of sight and is therefore less sensitive to more more diffuse and dust-poor regions in the GRB host galaxy, as well as the presence of large grains that may produce grey extinction.Based on our results, there is evidence for DTM mass to be more tightly linked to the total metallicity of the star-forming host galaxy.The evolution with redshift is thus likely just a consequence of the DTM-[M/H] tot relation and the overall chemical enrichment of star-forming galaxies with redshift.We caution that our result on the tentative anti-correlation of DTM SED with metallicity is likely nonphysical as dust growth in the ISM is generally expected to result in an increasing DTM with metallicity.If no ISM dust growth is considered, such that the dust would purely originate from stellar sources, the DTM should be constant but not decrease with metallicity (Mattsson et al. 2014).

The evolution of the dust-to-gas ratio
The H i column densities and visual extinction, A V , measured directly in the GRB sightlines provide an independent measure of the DTG ratio, DTG SED = A V /N HI , in high-redshift galaxies.In Fig. 5 we show the distribution of A V and N HI observed The symbol notation follows Fig. 3. Median relations at five discrete redshifts are also shown from the L-Galaxies MM simulation (Yates et al. in prep.).We observe a potential anti-correlation of DTM SED with metallicity, but a stronger, positive correlation of DTM mass with increasing metallicity.
for the GRB sample.For comparison, the average DTG SED ratios from specific Galactic sightlines and toward the SMC bar, the mean LMC and the LMC2 supershell from Gordon et al. (2003) are shown as well.We find that the majority of the GRB sightlines probe DTG ratios lower than observed in these local galaxies, and measure an average value (A V /N HI ) GRB = 8.95 × 10 −23 mag cm 2 .This is consistent with previous estimates of high-z GRB (Schady et al. 2010;Zafar et al. 2011) and general quasar DLA (Vladilo et al. 2008;Khare et al. 2012) sightlines.We note that the molecular hydrogen gas fraction is ≈ 5% at maximum in the GRB absorption systems (Bolmer et al. 2019;Heintz et al. 2019a), and is therefore negligible in the derivation of the DTG.
Similar to DTM mass , we can also infer the mass of an element X in the dust-phase relative to the total gas mass, the DTG mass ratio, DTG mass , here based on Eq. 3 as where [M/H] tot is again the total dust-corrected metallicity and Z ⊙ = 0.0139 is the solar metallicity by mass (Asplund et al. 2021).The derived DTG mass ratios span 10 −5 − 2 × 10 −3 as summarized in Table 2.
To explore the cause of the low relative dust to gas content in the GRB sightlines, we first examine the evolution of the DTG as a function of redshift in Fig. 6.Here, we again consider the DTG ratios calculated both from the SED fit of the dust extinction DTG SED and the depletion-derived DTG mass .We observe a tendency for a decreasing DTG as a function of redshift in both parameterizations albeit with a large scatter.This implies that the difference in the SED-and dust depletion-based DTM is likely not due to the differences in how the dust is probed.Notably, the bulk of the GRB-selected galaxies at z > 2 have DTG mass measurements below the MW average, reaching three orders of magnitudes lower at DTG = DTG mass = 3.5 × 10 −6 (GRB 050730) (see also Wiseman et al. 2017).
In Fig. 7 we further examine the evolution of the DTG as a function of metallicity.Here we also plot the DTG ratios mea- from Gordon et al. (2003).GRBs typically probe sightlines with lower A V for a given N HI than the local galaxies.
sured for the MW, SMC, and LMC.We find strong correlations between the DTG ratios inferred both from the dust extinction and from the depletion-derived mass ratio, with best-fit relations assuming DTG mass measured from the depletion level, with a correlation coefficient of r = 0.97.These relations are in good agreement with the measurements of the DTG ratios in the MW, SMC, and LMC, populating the high-metallicity end.Notably, GRB 141028A have a DTG SED ratio of A V /N HI ≳ 5 × 10 −22 mag cm 2 , exceeding the average MW DTG ratio of 4.5 × 10 −22 mag cm 2 (Watson 2011), but at substantially lower metallicities of −1.62 ± 0.28.On the other hand, the depletionderived DTG mass is observed to follow a tight correlation with the dust-corrected metallicity [M/H] tot .This suggests that for some particular sightlines like in GRBs 141028A, the extinction in the line-of-sight caused by dust grains are substantially larger than predicted from the overall depletion strength (see also Savaglio & Fall 2004;Wiseman et al. 2017;Bolmer et al. 2019;Konstantopoulou et al. 2022).This could potentially indicate that a dominant contribution from dust grains probed via the extinction are not recovered in the depletion analysis.Potential causes of the discrepancy are that a substantial amount of dust in this system is either confined in clumps of cold neutral medium, or in intervening systems along the line of sight.
We further compare our relations to the predictions by Li et al. (2019) based on the Simba cosmological hydrodynamic galaxy formation simulation and the L-Galaxies simulations by Yates et al. (in prep.).Although L-Galaxies MM matches the  3.For comparison we mark the DTG ratios of the MW, LMC, and SMC.The GRB sightlines through high-redshift galaxies typically probe lower DTG content than inferred from these local galaxies.
GRB relation well for systems below z ∼ 3, both simulations find steeper slopes for the metallicity evolution of the DTG than observed in the higher-redshift GRB sightlines.This could potentially indicate a more efficient dust production in the lowmetallicity regime probed by these high-redshift sightlines than what is currently prescribed in the simulations, as also indicated previously by the excess DTM mass ratios.

Quantifying the dust bias in our sample
Since this analysis is based on GRB afterglows observed with medium to high resolution spectroscopy, we might be biased against the most metal-or dust-rich sightlines that obscures the afterglow light below the detection threshold (Ledoux et al. 2009).To investigate whether our parent sample is subject to this bias, we compare the A V distribution to that of the more unbiased photometric GRB sample presented by Covino et al. (2013), limited to z > 1.7.Following Heintz et al. (2019a), we normalize the two distributions by the number of bursts at A V < 0.1 mag (assuming that the spectroscopic sample is at least complete to this limit) and then compute the detection probability, f det , of the fraction of GRBs in the spectroscopic sample versus that of the  3.We find strong correlations between the DTG inferred both from the SED (top panel) and from depletion (bottom panel) with the metallicity.In the bottom panel we also show the predictions from the Simba simulation at z ∼ 0 − 6 (blue dotted line; Li et al. 2019) and the L-Galaxies MM simulation at z ∼ 2 − 6 (dashed lines; Yates et al. in prep.).The orange dashed line is the limit in which all metals are incorporated into grains (DTM = 1).unbiased sample at the given range in A V (see Fig. 8).We find that the spectroscopic sample is complete up to A V = 0.3 mag (i.e.f det = 1), but that we are only recovering 25% of the expected GRB sightlines at A V = 0.3 − 1.0 mag ( f det = 0.25).At A V > 1 mag, the spectroscopic sample is only 7% complete and even less at larger visual extinctions.
An additional complication might be introduced via our selection.For instance, since we require a detection of at least a set of metal lines at 3σ this will inherently disfavor low-metallicity systems but less severely potentially dust extinguished sightlines due to the broader spectral continuum range observed.This might partly alleviate the tension of our observations with the simulations in the low-metallicity regimes of the DTG and DTM mass ratios (Figs. 4 and 7), and potentially indicate even steeper correlations with metallicity.
Conservatively, we can thus only conclude that the relations derived here are representative of the moderately extinguished (A V ≲ 0.3 mag) GRB host-galaxy population, and may deviate at larger dust columns.We note, however, that the large majority of the GRB population in complete samples of GRBs has A V < 1 mag, and there is no clear evidence that we are missing the most dust-obscured bursts at z > 2 (Krühler et al. 2011(Krühler et al. , 2012)).Nevertheless, while this potential dust obscuration bias may limit the number of the most dust-and metal-rich GRBs in the spectroscopic sample, the inferred trends of the relative DTM and DTG ratios are likely still valid.The main advantage of using GRBs as cosmic probes, lie in the high-z, low-metallicity regime, which is more difficult and time consuming to probe with direct emission-based surveys.

Summary and Future Outlook
In this work, we have presented the most comprehensive analysis to date of the chemical enrichment and the evolution of dust and metals in the ISM of star-forming galaxies at z = 1.7 − 6.3 hosting GRBs.We have compiled all GRB afterglow spectra observed through more than two decades (years 2000 − 2021) with sufficient spectral resolution (R > 7000) and S/N > 3 per resolution bin to enable robust measurements of the element abundances in the GRB lines of sight.GRBs are particularly efficient and relatively unbiased tracers of dense star-forming regions and thereby provide a unique view into the star-forming ISM properties of high-redshift galaxies in absorption that are otherwise difficult to probe in emission.
We found that the GRB-selected, star-forming galaxies had metallicities, corrected for the abundance of elements in the dustphase, that were on average evolving as a function of redshift following [M/H] tot (z) = (−0.21± 0.04)z − (0.47 ± 0.14).These galaxies revealed a slower gradual metal build-up compared to DLAs in quasar sightlines (De Cia et al. 2018), which traces the neutral gas on larger scales around galaxies.Our observations further exhibited a large scatter in the dust-corrected metallicities at a given redshift, which is not captured in most state-ofthe-art galaxy evolution simulations, although there is overall agreement with the chemical enrichment as a function of cosmic time.The largest observational uncertainty in this relation is reflected by the more sparse population of GRB afterglows detected at the highest redshifts, z ≳ 5.This particular population may also be subject to a more severe selection bias than at lower redshifts since they will appear optically "dark".
Based on our observations, we further derived the redshift and metallicity evolution of the DTG and DTM ratios in GRBselected, star-forming galaxies at z > 2. Previously, the farinfrared emission of galaxies have been used to determine the mass and temperature of the dust in the ISM (e.g., Draine & Li 2007).These dust mass estimates have commonly been used to infer the total gas or ISM mass of high-redshift galaxies (e.g., Magdis et al. 2012;Scoville et al. 2016), but typically assuming average MW dust-to-gas ratios or conversion factors.However, these might be systematically underestimated if the fraction of dust relative to the total gas and metal abundance in the high-redshift, metal-poor regimes of galaxies changes substantially.Indeed, in this work we observed that all the z ≳ 2 GRBselected galaxies probe sightlines with lower DTM and DTG values compared to the Galactic average.In particular, we found that the average DTM mass ratios at z ≈ 2 and z ≈ 6 were 0.3 and 0.15, respectively, 2 − 3× lower than observed in the Milky Way.Similarly, the average gas-to-dust mass ratio is ≈ 150 for the MW, where, for comparison, we observed sight lines reaching gas-to-dust mass ratios of ≈ 10 5 at z ≳ 3 or metallicities [M/H] tot < −1.5 (i.e.3% solar).
The DTM and DTG mass ratios derived here do not rely on any conversion factors, and thus provide an accurate measure of the relative mass fractions of gas, dust, and metals along the GRB line of sight.However, since GRBs only probe narrow pencil-beam sightlines through their host galaxies, the total integrated dust and metal abundances are difficult to determine without knowing the size and morphology of the systems.For example, the distribution of dust, metals and gas within the galaxy ISM may impact the measured DTM and DTG along a single line-of-sight.In spatially resolved observations, the DTG mass ratio has been found to vary as a function of hydrogen surface density (Clark et al. 2023) and metallicity (Solís-Castillo & Albrecht 2020), with variations of over an order of magnitude within a single galaxy.A similar variation is also seen as a function of radius in galaxy evolution simulations out to 4 − 5 R e (Yates et al. in prep.).The DTM mass ratio is similarly observed to vary within galaxies as a function of metallicity (Chiang et al. 2018), irrespective of the α CO conversion factor applied.The average DTM and DTG mass ratios measured along single sightlines in absorption may thus be weighted differently to the average properties probed in emission.For example, the latter method would likely be more biased towards the higher metallicity and higher surface density regions of a galaxy, that have higher DTM and DTG mass ratios.It is also important to consider differences in the gas and dust radial profiles, with the former generally extending out to larger radii by up to 50-100% (Thomas et al. 2004).This could dilute the DTG and DTM measured along GRB sightlines, although this is not supported by simulations which predict systematically lower DTG and DTM mass ratios than measured along GRB sight lines (e.g.Figs. 4   and 7).Some additional caveats in directly mapping the GRB results to simulations and emission-selected galaxy studies also include the physical environments of the GRBs, such as the ionization state and density of the gas.
Despite the expected differences between absorption and emission probes, the GRB absorption approach has proven to be extremely effective in determining the [C i]-to-H 2 conversion factor in high-redshift, low-metallicity galaxies (Heintz & Watson 2020), which is otherwise difficult to constrain (Bolatto et al. 2013), and even provide novel constraints on the H i gas masses of galaxies at z > 2 (Heintz et al. 2021(Heintz et al. , 2022b)).This would suggest that GRBs are able to probe galaxy average properties in the radial direction, even if only along a single sightline.The DTM and DTG mass abundance ratios derived here, therefore like enable more accurate determinations of the total gas or ISM masses of high-redshift galaxies, based on the dust masses inferred from the far-infrared dust continuum emission of independent galaxy samples.
In the near-future, this field is certain to rapidly advance with the new spectroscopic observations of high-redshift galaxies with the JWST.Already now, gas-phase metallicities of galaxies have been measured using temperature-sensitive diagnostics up to z ≈ 9 in emission (Heintz et al. 2022a;Curti et al. 2023;Nakajima et al. 2023;Sanders et al. 2023), and even inferred approximately through strong-line diagnostics at z > 10 (Bunker et al. 2023;Hsiao et al. 2023;Heintz et al. 2023b).The unique synergy between JWST and the Atacama Large Millimetre/submillimetre Array (ALMA) further enables direct DTM and DTG mass ratio measurements of galaxies well into the epoch of reionization at z > 6 (Heintz et al. 2023a).Characterizing the host galaxies of GRBs in emission with similar observations would be the natural next step, solidifying the link between galaxy properties derived in absorption and emission.

Fig. 2 A
Fig.2 AV vs. the equivalent metal column density, log N HI + [M/H] tot , i.e. the dust-to-metals (DTM) ratio.The red symbols show the GRB sample where the triangles denote 1σ upper limits.The dashed and dotted lines represent the average MW ratio and the scatter(Watson 2011).GRB sightlines probe a large range in DTM ratios, with an average around the Galactic mean value.

Fig. 3
Fig. 3 SED-derived dust-to-metals ratio, DTM SED = log A V − (log N HI + [M/H] tot ) (top) and depletion-derived DTM mass ratio (bottom) as a function of redshift.Red dots (measurements) and triangles (1σ upper limits) denote the GRB host-galaxy absorbers.Grey symbols show the equivalent values for the MW, LMC and SMC.The Pearson correlation coefficients r are marked for each data set.We observe no clear evolution of DTM SED with redshift, but a mild evolution of DTM mass , with the best-fit relation shown as the solid black line, with the darkand light-shaded gray regions indicating the 1 and 2σ confidence intervals.
Fig. 4 SED-derived dust-to-metals ratio, DTM SED = log A V − (log N HI + [M/H] tot ) (top) and depletion-derived DTM mass ratio (bottom) as a function of dust-corrected metallicity [M/H] tot .The symbol notation follows Fig.3.Median relations at five discrete redshifts are also shown from the L-Galaxies MM simulation(Yates et al. in prep.).We observe a potential anti-correlation of DTM SED with metallicity, but a stronger, positive correlation of DTM mass with increasing metallicity.

Fig. 5
Fig.5Visual extinction A V vs. H i column density N HI , i.e. the dust-to-gas (DTG) ratio.The symbol notation follows Fig.3.For comparison are overplotted the average DTG ratios from specific sightlines in the Local Group (MW, LMC, LMC2 and SMC) fromGordon et al. (2003).GRBs typically probe sightlines with lower A V for a given N HI than the local galaxies.

Fig. 6
Fig.6SED-derived dust-to-gas ratio, DTG SED = A V /N HI (top) and depletion-derived DTG mass (bottom) as a function of redshift.The symbol notation follows Fig.3.For comparison we mark the DTG ratios of the MW, LMC, and SMC.The GRB sightlines through high-redshift galaxies typically probe lower DTG content than inferred from these local galaxies.
Fig.7SED-derived dust-to-gas ratio, DTG SED = A V /N HI (top) and depletion-derived DTG mass (bottom) as a function of dustcorrected metallicity [M/H] tot .The symbol notation follows Fig.3.We find strong correlations between the DTG inferred both from the SED (top panel) and from depletion (bottom panel) with the metallicity.In the bottom panel we also show the predictions from the Simba simulation at z ∼ 0 − 6 (blue dotted line;Li et al. 2019) and the L-Galaxies MM simulation at z ∼ 2 − 6 (dashed lines;Yates et al. in prep.).The orange dashed line is the limit in which all metals are incorporated into grains (DTM = 1).
Fig. 8 Dust-corrected metallicity as a function of H i column density.The red dots again present the main GRB sample, but here size-coded as a function of the SED-derived A V .The greyshaded regions represent increasing A V , marked for each region, assuming the constant DTM SED ratio derived in this work of log A V − [log N HI + [M/H] tot ] = −21.4.The estimated detection probability for GRBs in each of these A V ranges are marked as well (see Sect. 4.4 for further details).
This Table is composed mainly of GRB afterglow measurements from the literature, in addition to the three new bursts analyzed here:GRBs 190106A, 190919B, and 191011A.Col.(1):GRB names.Col.