Public Release of A-SLOTH: Ancient Stars and Local Observables by Tracing Halos

The semi-analytical model A-SLOTH (Ancient Stars and Local Observables by Tracing Halos) is the first public code that connects the formation of the first stars and galaxies to observables. After several successful projects with this model, we publish the source code and describe the public version in this paper. The model is based on dark matter merger trees that can either be generated based on Extended Press-Schechter theory or that can be imported from dark matter simulations. On top of these merger trees, A-SLOTH applies analytical recipes for baryonic physics to model the formation of both metal-free and metal-poor stars and the transition between them with unprecedented precision and fidelity. A-SLOTH samples individual stars and includes radiative, chemical, and mechanical feedback. It is calibrated based on six observables, such as the optical depth to Thomson scattering, the stellar mass of the Milky Way and its satellite galaxies, the number of extremely-metal poor stars, and the cosmic star formation rate density at high redshift. A-SLOTH has versatile applications with moderate computational requirements. It can be used to constrain the properties of the first stars and high-z galaxies based on local observables, predicts properties of the oldest and most metal-poor stars in the Milky Way, can serve as a subgrid model for larger cosmological simulations, and predicts next-generation observables of the early Universe, such as supernova rates or gravitational wave events.


INTRODUCTION
Corresponding author: Tilman Hartwig hartwig@phys.s.u-tokyo.ac.jp a) https://gitlab.com/thartwig/asloth One of the main goals of astrophysics is to understand the formation of stars and galaxies in order to explain the evolution of the Universe from an almost homogeneous state shortly after the Big Bang until the present day. There are many open questions on small and large scales about the formation of stars and evolution of galaxies. One possible approach to answer these questions is to model the temporal evolution of the Universe from high redshift to z = 0 (e.g., Peebles 1993).
Structures in the Universe formed hierarchically under the influence of gravity. Small overdensities in the early Universe evolved into the first gravitationally bound structures, which then hosted the first stars. These small early structures then merged with other halos to form bigger structures, which eventually became the galaxies that we observe in the local Universe. In general, the evolution of the Universe proceeds from simple to complex, from small to big, from pristine to chemically enriched.
While we have many observations of phenomena in the local Universe, our understanding of the cosmos decreases as we move further back in time. For example, we understand star formation in the Milky Way (MW) sufficiently well to model the mass distribution of newlyformed stars or predict the final fates of nearby stars. However, we have only a vague understanding of the formation of the first generations of stars due to a lack of direct observables (Glover 2005;Bromm & Yoshida 2011;Greif 2015;Klessen 2019).
Numerical simulations and semi-analytical models (SAMs) can bridge this gap. Starting from ab-initio initial conditions, one can model the successive formation of stars and galaxies in time and eventually compare the results of such models with observations of the real Universe. However, numerical simulations have to compromise between mass resolution and effective volume or use a zoom-in approach in which only some part of the volume is simulated with higher resolution. For example, TNG50 Pillepich et al. 2019), the highest resolution cosmological simulation in the IllustrisTNG project, has a mass resolution of m DM = 4.5 × 10 5 M and m baryon = 8.5 × 10 4 M for dark matter (DM) and baryons, respectively. Therefore, the smallest galaxies reliably identified in TNG50 (Engler et al. 2021) have stellar masses of ∼ 6.5 × 10 6 M or more, far larger than the values we expect to find in the first minihalos at high redshift. On the other hand, numerical simulations with high enough resolution to reliably model individual minihalos (e.g. Schauer et al. 2019) typically involve volumes that are too small to allow them to follow the formation of massive galaxies at low redshift.
SAMs are computationally more efficient (see below). Depending on their specific flavor, they approximate galaxies with a few characteristic parameters and components and follow their evolution through time. This makes SAMs computationally very efficient and allows one to focus on a few specific aspects of the model. For example, one can trace certain types of stars from high redshift to the present-day MW.
Moreover, SAMs allow to explore a broad parameter space due to their modest runtime. While large cosmological simulations have runtimes of several million CPU hours on modern supercomputers Yoshikawa et al. 2021), a SAM that covers a similar effective volume can run on an ordinary computer within a few minutes.
This computational efficiency makes SAMs ideally suited to study poorly constrained phenomena in the early Universe that have an observable effect in the local Universe. For example, the formation of the first stars in the Universe (Population III or Pop III) is not well understood. However, it affects the epoch of reionization (Bromm & Yoshida 2011;Dayal & Ferrara 2018), the formation of the first supermassive black holes (SMBHs, Latif & Ferrara 2016;Woods et al. 2019;Inayoshi et al. 2020), and the population of old and metal-poor stars in the MW and its satellites (Frebel & Norris 2015;Ishigaki et al. 2021). A SAM allows us to model the formation of the first stars under different assumptions (e.g. regarding their mass distribution or star formation efficiency) and determine which assumption agrees best with observations, and the models that reproduce successfully known observables can be used to predict new diagnostics. Therefore, SAMs provide a fruitful connection between the high-redshift Universe and local observables.
In the next section, we first present our previous models that lead to the development of a-sloth, then discuss and compare various other SAMs in the literature, and eventually present the benefits of a-sloth. a-sloth ( Fig. 1) was originally developed to model the formation and distribution of possible low-mass Pop III survivors in the MW (Hartwig et al. 2015) and is based on Galform (Parkinson et al. 2008). We then extended a-sloth to be applicable for a variety of questions relating to the high-redshift Universe as well as the oldest stars in the MW. Among the topics we examined were whether the Lyman-α emitter CR7 (Sobral et al. 2015) could be a Pop III galaxy (Hartwig et al. 2016b), surviving metal-free stars in MW satellites (Magg et al. 2018), constraints on the pristine initial mass function (IMF) (Tarumi et al. 2020) and the low-mass end of the satellite mass function of the MW (Chen et al. 2022). Each of these works built upon the previous models and contributed to the development of a-sloth.

A-SLOTH Ancestors
We also implemented a weighting of merger trees based on their mass and predicted the rates of SNe from the first stars (Magg et al. 2016). This weighting allows us to sample halo masses from the halo mass function in order to create a cosmologically representative effective volume from merger trees generated using the Extended Press-Schechter (EPS) formalism (see Section 2.1.1 below). We also applied this updated model to predict the gravitational wave (GW) signals from the remnants of the first stars and showed that the detection rate of Pop III GW events is lower than events from other channels, but that Pop III remnants could be identified in the future by their higher masses (Hartwig et al. 2016a). In an independent study, Liu & Bromm (2021) adapted asloth to study GWs from the remnants of the first stars in nuclear star clusters and found promising detection rates.
Our improved treatment of metallicity allowed us to model the transition from Pop III to Pop II star formation more realistically, and we have implemented a new model for inhomogeneous metal mixing in the first galaxies (Hartwig et al. 2018b;Tarumi et al. 2020;Ishigaki et al. 2021). This allows us to predict the properties of second-generation and extremely metal-poor (EMP) stars.
An updated model of star formation allowed us to sample individual Pop II stars and to simulate the formation of MW satellites (Chen et al. 2022). a-sloth can also be used as a subgrid model to larger cosmological simulations, e.g., to predict the impact of the first generations of stars on the global 21cm signal (Magg et al. 2022b).

Other Semi-Analytical Models
SAMs are popular in astrophysics due to their moderate computational costs. They approximate and simplify physical processes to model the Universe, and each has been developed for a specific purpose which limits their applicability. Here, we summarize SAMs in the literature, discuss their strengths, and later show how a-sloth improves upon several aspects of these existing models. We limit this discussion to only SAMs that model Pop III star formation explicitly. There are further notable SAMs that model the pre-reionization Universe but that do not focus on the formation of the first generations of stars (Volonteri et al. 2003;Barausse 2012;Starkenburg et al. 2013;Agarwal et al. 2016;Pacucci et al. 2018). A comparison of further SAMs is presented in Knebe et al. (2015).
gamete (Salvadori et al. 2007) and its extension gamete/qsodust (Valiante et al. 2016(Valiante et al. , 2018 model early star and galaxy formation based on EPS-generated merger trees. The model is suited for all redshifts. It has an arbitrarily fine mass resolution and accounts for radiative, chemical, and mechanical feedback. To distinguish between Pop III and Pop II star formation, the authors use a criterion based on a critical metallicity Z crit (Bromm & Loeb 2003), i.e., on the total metal and dust content of a halo. They calibrate their model by reproducing global properties of the Milky Way at z=0 and the MDF of Galactic halo stars (Salvadori et al. 2007). The model is also able to reproduce the observed properties of Local Group dwarf galaxies (e.g. Salvadori et al. 2015) and, by modeling the inhomogeneous reionization and metalenrichment, it has been used to study high-z Damped Lyman Alpha absorption systems (Salvadori & Ferrara 2012), dwarf galaxies as reionization sources , and the IMF of the first black hole seeds . The latest improvement of the model also accounts for the random sampling of the Pop III IMF . They are also able to reproduce a sample of quasars at high redshift (Valiante et al. 2014) with a subgrid model for the seeding, growth, and feedback of SMBHs (Valiante et al. 2016(Valiante et al. , 2018Sassano et al. 2021;Trinca et al. 2022).
The gamete code has also been combined with N-body simulations to gain spatial information (Salvadori et al. 2010;Pacucci et al. 2017). Graziani et al. (2017) apply the model gamesh (Graziani et al. 2015) to simulate the formation of a MW-like halo. They follow the baryonic evolution and radiative transfer in a semi-analytic way. This model reproduces the observed galaxy main sequence, the massmetallicity and the Fundamental Plane of metallicity relations in the redshift range z = 0 − 4.
The SAM presented in Visbal et al. (2020) simulates the formation of the first stars based on merger trees from N -body simulations. It has a mass resolution of ∼ 10 5 M and operates in the redshift range z = 6 − 30. The authors include feedback in the form of ionizing ra-diation, Lyman-Werner (LW) photons, and metals based on a computationally efficient fast Fourier transform. They also take into account baryonic streaming and reproduce the galaxy UV luminosity function at high redshift.
delphi (Dayal et al. 2014(Dayal et al. , 2017(Dayal et al. , 2019Chatterjee et al. 2020) is a versatile model used to simulate the formation of galaxies and SMBHs at high redshift. It operates down to z = 4 and allows for different cosmologies (other than ΛCDM). The authors include external radiative feedback and can reproduce various independent observables, such as the UV luminosity function of galaxies and quasars at z = 6, star formation rate density (SFRd), the black hole mass function, and the black hole mass -stellar mass relation.
gamma is a SAM used to probe the impact of nuclear astrophysics uncertainties on galactic chemical evolution and to address the chemical evolution of dwarf galaxies in the early universe (Côté et al. 2017. It uses spatially resolved DM merger trees from N -body simulations and operates down to z ∼ 7. It includes inhomogeneous metal mixing in post-processing to reproduce the metallicity distribution function (MDF) of the Renaissance simulation (O'Shea et al. 2015). gamma traces individual elements and even s-and r-process elements.
startree models the formation of the first and second generation of stars to constrain properties of the first stars based on EMP stars in the MW. It is based on EPS trees and follows individual Pop III stars (Komiya et al. 2009(Komiya et al. , 2010. The model includes LW and ionizing feedback, a stochastic model for the inhomogeneous distribution of metal enrichment inside a proto-galaxy, faint Pop III SNe, and allows the user to trace individual elements. The authors also take into account the possible surface pollution of low-mass Pop III stars from accreted metal-enriched interstellar medium (ISM) (Komiya et al. 2010(Komiya et al. , 2016. The model reproduces the observed mass-metallicity relation for dwarf galaxies and various abundance ratio distributions. It operates in the redshift range z = 0 − 30 and can also be used to model the formation of EMP (Komiya 2011), and CEMP stars (Komiya et al. 2020).
The SAM by Tumlinson (2006) models the chemical evolution of MW-like galaxies to constrain the properties of the first and second generation of stars. The original version is based on EPS-generated merger trees, and an updated model uses merger trees from N -body simulations as input (Tumlinson 2010;Gómez et al. 2014). The authors include chemical, radiative, and mechanical feedback. The model is calibrated to reproduce the MDF, ionization history, relative elemental abundances of EMP stars, the MW stellar mass, and the satellite luminosity function. Trenti & Stiavelli (2009) developed a SAM based on EPS merger trees to model the formation of the first and second generation of stars. The model includes radiative feedback and stochastic metal enrichment. The authors do not constrain their model based on observables but instead vary various input parameters to study the effect of such parameters on the model. Finally, the SAM by Crosby et al. (2013) is based on N -body simulations and allows the user to study cosmic variance. It includes radiative, chemical, and mechanical feedback. The authors constrain their model based on the SFRd at z = 10.3.
1.3. Advantages of a-sloth a-sloth is the first public SAM that models the formation of the first stars and galaxies and connects those to local observables. It is calibrated transparently, which guarantees reproducibility and has several advantages over existing SAMs.
a-sloth uses both EPS-based merger trees and merger trees extracted from N -body simulations as input. We provide instructions to obtain spatially resolved merger trees, or users can input their own data. Currently, a-sloth supports merger trees in the format of consistent-trees (Behroozi et al. 2013b), but the input routine can be modified easily. Other SAMs that use only EPS-generated merger trees (e.g. Salvadori et al. 2007;Komiya et al. 2009;Dayal et al. 2014) lack spatial information for the feedback model. a-sloth is highly efficient and parallelized. We have implemented a tree-based lookup for external feedback and minimized the memory requirements as much as practically possible. A MW-like galaxy can be simulated in just over 1 minute on 4 cores, and a cosmologically representative box with side length 8 Mpc/h can be simulated down to redshift z = 4.5 on 32 cores in only 14 minutes. The SAM by Visbal et al. (2020) evaluates feedback with a computationally efficient fast Fourier transform. Other SAMs do not comment on their runtime efficiency. For more details on the numerical implementation of a-sloth, see Magg et al., in prep. a-sloth is the only SAM that samples and traces individual Pop III and Pop II stars. The stochastic sampling of individual stars and their SNe is essential to correctly model feedback in the first galaxies (Chen et al. 2022) and the tracking of individual Pop II stars allows us to directly compare our results to metal-poor stars in the MW. Only few EPS-based models (Tumlinson 2006;Komiya et al. 2009;de Bennassuti et al. 2017) sample individual Pop III stars. In the nomenclature of a-sloth, we distinguish between Pop III (metal-free) and Pop II (metal-poor) stars. The group of Pop II stars in a-sloth also includes Pop I (about Solar metallicity) stars, but the exact distinction between Pop II and Pop I is not relevant for all current applications of a-sloth.
a-sloth tracks individual elements. In the current version, we track carbon and iron explicitly, which allows us to apply a more sophisticated, observationally motivated criterion to distinguish between Pop III and Pop II star formation (Chiaki et al. 2018). Some SAMs also track individual elements Komiya et al. 2020) or even include dust (Valiante et al. 2014(Valiante et al. , 2016de Bennassuti et al. 2017) while others use a simple metallicity cut to discriminate between Pop III and Pop II stars (Tumlinson 2006). a-sloth is calibrated based on six observables, and our calibration procedure is robust and transparent. SAMs usually have various free parameters that need to be chosen. Depending on the application, one guesstimates or calibrates these input parameters to reproduce specific data. Other SAMs aim to reproduce the MDF (Tumlinson 2006;Komiya et al. 2010), observations of quasars or the UV luminosity function at high redshift (Dayal et al. 2014;Valiante et al. 2018;Visbal et al. 2018;Trinca et al. 2022), properties of the MW and its dwarf galaxies de Bennassuti et al. 2017), or results from simulations ). asloth is calibrated to reproduce observables from both the local and high-redshift Universe simultaneously, and we show that this calibration also reproduces additional constraints that we did not actively aim for, such as the metal-enriched volume fraction, the recovery time distribution from simulations, or the observed stellarmass-to-halo-mass (SMHM) relation.
a-sloth models the recovery time (Jeon et al. 2014) between Pop III and Pop II star formation selfconsistently by following the baryonic contents over time. Most models assume a constant recovery time, which defines the time between the last Pop III SN explosion and the first Pop II star formation in a halo. We solve the differential equations that govern the heating and cooling of gas inside minihalos in order to calculate when there is enough gas re-accreted after the first SNe to trigger second-generation star formation.
a-sloth includes a subgrid recipe for inhomogeneous metal mixing inside the first galaxies, which is crucial to correctly model the metallicity of second-generation stars (Hartwig & Yoshida 2019;Tarumi et al. 2020). Only the SAM of Komiya et al. (2020) applies a similar model that takes into account inhomogeneous mixing of metals. Salvadori & Ferrara (2012) account for incomplete mixing of metals in SN outflows. To mimic the effect of inhomogeneous mixing, Côté et al. (2018) apply a smoothing kernel to their MDF in post-processing in order to match results from simulations.

Input Data
Throughout the code, we assume a flat ΛCDM Universe and use cosmological parameters from Planck Collaboration et al. (2016a). Specifically, we use the Hubble constant H 0 = 68 km s −1 Mpc −1 = 100h km s −1 Mpc −1 , a matter density of Ω m = 0.31, and a baryon mass fraction of Ω b = 0.049.
The primary input to a-sloth is a DM halo merger tree describing the merger history of the DM halos whose stellar and gas content we wish to model. These merger trees can be generated using the EPS formalism or extracted from a cosmological simulation, as described in more detail below.

Extended Press-Schechter Merger Trees
The "Press-Schechter" formalism (Press & Schechter 1974) provides an analytical estimate for the number density of DM halos as a function of their mass and redshift. The comoving number density of halos of mass between M and M + dM is given by where ρ m is the matter density and σ(M ) is the standard deviation of the matter power spectrum smoothed on a mass scale M . The parameter ν c = δ c (z)/σ(M ) is the critical threshold for collapse, where δ c is the overdensity for non-linear collapse with δ c (z = 0) = 1.686. The original derivation of Equation 1 by Press & Schechter (1974) suffered from a normalization problem: the mass function they derived accounted for only half the mass in the Universe, with Equation 1 following only after they multiplied their original expression by a factor of two, a procedure with little justification in their study. Later, Bond et al. (1991) showed that this normalization problem is a consequence of the cloud-in-cloud problem, i.e. the difficulty of correctly counting clouds in hierarchical models where smaller clouds can be embedded within larger ones. They showed that this problem could be avoided by an approach based on excursion sets of the density field. In this approach, the DM density field is smoothed on a series of successively smaller scales R, each of which is associated with a corresponding mass scale M (R). Points in the density field that first exceed the overdensity δ c when smoothed on scale R are then associated with halos of mass M . In the simple case where the smoothing function is a sharp "top-hat" filter in wave-number space, the mass function that one obtains from this approach is identical to Equation 1, i.e. the method avoids the original normalization problem. Lacey & Cole (1993) subsequently extended this idea by showing that the excursion set approach allows one to derive not only the DM halo mass function but also the halo merger rate, as well as specific merger histories for individual halos. In a-sloth, we use the publicly available implementation of this EPS formalism presented by Parkinson et al. (2008), which is based on the code by Cole et al. (2000).
The EPS formalism as used in the original galform code (Cole et al. 2000) systematically under-predicts the mass of the most massive progenitors at higher redshifts. Hence, we use the updated version of the code by Parkinson et al. (2008), which modifies the progenitor mass function with a perturbing function to reproduce the halo merger histories of the Millennium simulation (Springel et al. 2005). The best-fitting perturbing function is given by We have chosen this specific implementation of the merger tree, because it performs best compared to other EPS implementations (Jiang & van den Bosch 2014).

Merger Trees from N-Body Simulations
In addition to statistically generated merger trees, we use merger trees from cosmological simulations. In this study we use merger trees from Ishiyama et al. (2016) and from the Caterpillar project (Griffen et al. 2016(Griffen et al. , 2018. These N -body DM only simulations both have sufficient resolution to resolve minihalos. For both simulations, halos were obtained with Rockstar (Behroozi et al. 2013a) and merger trees were generated with Consistent-Trees (Behroozi et al. 2013b). From the Caterpillar simulations we use merger trees of 30 zoomin simulations of MW-mass galaxies (M halo ∼ 10 12 M ), selected from a (100h −1 Mpc) 3 parent box. The highresolution DM particle mass is 3 × 10 4 M , and the simulations are run to z = 0 with a high-resolution region that extends at least to the virial radius of each halo. This is the largest suite of high-resolution zoom-in simulations of MW-mass that can resolve the formation of low-mass minihalos, providing a good estimate of cosmic variance of z = 0 properties related to the first stars and galaxies. They are ideal for applications related to stellar archaeology. The data from Ishiyama et al. (2016) include merger trees for every halo within an (8 h −1 Mpc) 3 comoving volume down to a redshift of z = 4 with a mass resolution of 5000h −1 M . These merger trees allow us to model a large and well-defined cosmological volume for applications such as ionization histories, cosmic star formation histories or the largescale enrichment of the intergalactic medium (IGM).
Both the Caterpillar trees and the merger trees from Ishiyama et al. (2016) are not one individual tree but rather contain a larger number of separate merger trees. In order to allow for easier processing, we resort the merger trees into a strictly time-ordered manner and save them as a binary file.
For brevity, we refer later to the cosmologically representative merger trees from the box with side length 8 h −1 Mpc as the '8 Mpc box'. For comparison, each set of Caterpillar merger trees is extracted from a volume with an effective side length of (3 -6) h −1 Mpc.
In contrast to the Press-Schechter merger trees, trees from N-Body simulations generally contain subhalos. We prevent accretion of hot gas onto subhalos, because their masses tend to fluctuate significantly during mergers. Otherwise subhalos are treated, followed, and modeled just as normal halos.

Auxiliary Input Data
SAMs do not simulate star formation from first principles, such as hydrodynamic simulations do. Instead, a-sloth relies on analytical formulae and pre-computed input quantities. In this section, we explain and justify our input data and explain how users can modify these assumptions for their own models. In the following equations, x = log 10 (M star / M ), where M star is the zero-age-main-sequence mass of an individual star.
• Stellar lifetime We compute the stellar lifetimes of Pop III stars with the fitting function (Schaerer 2002) log 10 (T III /yr) = 9.785−3.759x+1.413x 2 −0.186x 3 , (6) and the stellar lifetimes of Pop II (1/50 Z ) stars with the fitting function (Stahler & Palla 2005) log 10 (T II /yr) = 10 − 3.68x + 1.17x 2 − 0.12x 3 . (7) These fitting functions were constructed to express the lifetimes of stars in the mass ranges 5−500 M for Pop III stars and 7 − 150 M for Pop II stars. However, the values predicted by Eq. 6 agree well with the individual stellar lifetimes of Pop III stars calculated by Marigo et al. (2001) down to subsolar masses. We therefore use this fitting function down to Pop III stellar masses of 0.7 M . For lower stellar masses, the lifetimes exceed the current age of the Universe and so the precise values are unimportant. For Pop II stars, our adopted fitting function agrees well with data from Ekström et al. (2008) and Stahler & Palla (2005) for stellar masses ranging from sub-solar values up to 200 M . Therefore, we also use the fit of stellar lifetimes in this extended range.

• Ionizing photon production
The time-averaged ionizing photon production rates of Pop III and Pop II stars are computed with the formulae 1 (Schaerer 2002) log 10 (Q III /s −1 ) = 43.61 + 4.9x − 0.83x 2 , (8) and log 10 (Q II /s −1 ) = 27.8 + 30.68x − 14.8x 2 + 2.5x 3 , (9) respectively. These fitting functions are valid in the mass ranges 9 − 500 M for Pop III and 7 − 150 M for Pop II stars. We consider all photons with energies above E γ = 13.6 eV to be ionizing and neglect the ionization of helium.

• Metal yields ejected by supernovae
For the metal yields ejected by supernovae, we use 1 These formulae are for single stars. We do not currently account for the impact of binaries on the ionizing photon production rate. However, we expect that their impact will be relatively small. At early times, when photoionization is the most important form of feedback, the ionizing photon production rate is insensitive to the binary fraction, while at later times, when the sensitivity is much larger (Secunda et al. 2020), supernovae will typically be the most important form of feedback.
tabulated values from Nomoto et al. (2013) and Kobayashi et al. (2006) for Pop III and Pop II stars, respectively. In its current version, a-sloth follows carbon, iron, and the sum of all elements explicitly. The code can be modified easily to track more or less elements.
The implemented SN energy, metal yields, ionizing flux, and stellar lifetimes are illustrated as a function of stellar mass in Fig. 2. We illustrate the quantities in the stellar mass ranges for which they are valid. However, please note that we sample Pop II stars only up to 100 M and Pop III stars only in the mass range M min = 5 M to M max = 210 M in our fiducial model. Users of a-sloth can adapt these mass ranges as desired.
We interpolate values onto our grid of stellar masses for lifetime and ionizing flux. For the metal yields, we use the nearest-neighbor values, i.e. the nearest stellar mass available in the tabulated grid of SN yields, so that we do not create artificially mixed combinations by interpolating supernova yields from different progenitor masses. We assume that Pop II and Pop III stars in the mass range 10 − 40 M explode as corecollapse supernovae (CCSNe), Pop III stars in the mass range 140 − 260 M explode as pair instability supernovae (PISNe) (Heger & Woosley 2002, and that there are no supernova explosions for stellar masses between these two ranges.

Baryon Cycle
Each halo is initially assigned a baryonic mass fraction equivalent to the mean cosmic baryon fraction, i.e., the initial baryonic mass of a halo with virial mass M vir is assumed to be (Ω b /Ω m )M vir prior to any star formation. While baryonic streaming might lower this baryon mass in low-mass halos at high redshift, this effect is not important for star-forming halos in our fiducial model (Naoz et al. 2012). To model the evolution of this baryonic mass, we adopt a model in which the baryonic content associated with a halo is divided into four components: cold gas (M cold ), hot gas (M hot ), stars (M * ), and outflows (M out ). We summarize the nomenclature in Table 1. During the evolution we add hot gas to a halo until which implies that halos accrete additional baryons whenever their DM mass increases by smooth accretion.
Smooth accretion here refers to increases in DM mass and the associated increase in baryonic mass which is not caused by mergers or caused by mergers with un-  Figure 2. Illustration of SN energy, ionizing photon production rate, carbon and iron ejecta mass, and stellar lifetimes that are available as input data in a-sloth. Pop III stars are plotted in red and Pop II stars are plotted in blue. Users can define their own specific mass ranges or use other tabulated input data. resolved halos. During mergers all baryonic mass components from the merging halos are added. If DM is accreted via smooth accretion, the baryonic mass budget of the halo is increased accordingly. The additional mass is gradually added to the hot component of the halo. Note that in our model, hot gas is any gas that is not cold, i.e. it corresponds to the sum of the warm and hot phases of the ISM in the usual three-phase description (McKee & Ostriker 1977;. The outflow mass is the cumulative mass that is unbound from the halo and is never re-accreted.
In addition, we allow star formation to have multiple epochs instead of one star burst in each global timestep ∆t z . To accurately follow star formation within individual halos we introduce sub-cycling. This is necessary, because the global timestep of most available merger trees is larger than the typical lifetime of massive stars, and so multiple epochs of stellar birth and feedback are expected within one global model timestep. At each global timestep, adopted from the time-evolution of the merger-trees, the halo inherits the baryonic properties from all of its progenitors, which are M 0 cold , M 0 hot , M 0 * ,II , and M 0 out . Note that halos at the very first global timestep are initialized with hot gas only due to the accretion shock. Then, in each sub timestep i, the baryonic content is updated using the following equations: where we assume that the time scale of the cooling of hot gas is comparable to the dynamical time scale of the disk, which is composed of stars and cold gas in a central region with a radius of R s . Therefore, t dyn = R s /v dyn , R s = R vir /c dm is the scale radius of a halo (Navarro et al. 1996), and the velocity is the characteristic velocity of the central part of the halo. For the halo's concentration, c dm , we use the fitting functions provided by Correa et al. (2015), which are provided in Appendix A.1. We describe how we determine each baryonic component in the following sections.

Critical Halo Mass for Cooling and Star Formation
To form stars, gas needs to cool down efficiently. Since we cannot follow the chemical evolution of the gas without information on the spatial distribution of the In our fiducial model, we follow Eqs. 9 and 10 from Schauer et al. (2021), which quantify the dependence of M crit on the strength of the LW background and the velocity associated with the large-scale streaming of the baryons relative to the DM, v BC . Combining these equations yields the following expression for M crit : Here, J 21 is the strength of LW background in units of 10 −21 erg s −1 cm −2 Hz −1 sr −1 and v BC is the streaming velocity in units of the root-mean-square value, σ rms . We adopt v BC = 0.8, as this is the most likely strength of the streaming velocity (Schauer et al. 2021), and we assume a redshift-dependent form for the LW background following Greif & Bromm (2006), We assume a global LW background, because the effective volume that we simulate with a-sloth is smaller than the mean free path of LW photons (Ahn et al. 2009). In this public release, we do not include nearfield LW feedback, which could self-regulate Pop III star formation. However, users can easily implement this feature, for example following the model in Hartwig et al. (2016a).
We note that the expression for M crit we use here corresponds to the mass scale at which an average halo of that mass can cool and form stars. The minimum halo mass for cooling, M crit,min is typically a factor of a few smaller (Schauer et al. 2021), but as cooling and star formation only occurs in a few halos in the mass range M crit,min < M < M crit , we consider that M crit is the more appropriate value for our purposes.
If a halo is above the atomic cooling limit (T vir ≥ 10 4 K), the gas can cool efficiently as long as the ionizing background is not strong (Visbal et al. 2017, also see Section 2.5.2). We account for this via another critical mass scale (Hummel et al. 2012) The final critical mass scale is then simply given by

Stochastic Sampling of Stars
Earlier versions of a-sloth considered star formation only in clusters with IMF-averaged and continuous feedback (Hartwig et al. 2015;Magg et al. 2018;Tarumi et al. 2020). We have improved the star formation model by adding a new implementation such that we keep track of individual stars (Chen et al. 2022). We store the numbers, masses, and birth times of stars that formed in each timestep. This novel feature of a-sloth allows us to account for stellar feedback at specific times and to correctly model the stochastic feedback. In addition, we are also able to keep track of those stars which survive until z = 0 and their stellar properties.
First, we determine whether we form Pop III stars or Pop II stars based on the empirically derived critical metallicity of Chiaki et al. (2017 where η * is the star formation efficiency per free-fall time, t i cold,ff = (Gρ i cold ) −1/2 is the free-fall time of cold gas, ρ i cold = M i cold /V cold is the mean cold gas density of the halo, M i cold and V cold are the mass and volume of cold gas, respectively.
Next, we compute the average number of stars in the logarithmically spaced IMF bins. For Pop II stars, we assume a Kroupa IMF and for Pop III stars, we assume an IMF with slope α III = 1. If n j,avg ≤ 10, we determine the number of stars in bin j with Poisson sampling. We deactivate Poisson sampling when n j,avg > 10 to save computational time.

Photoheating
Massive stars with M star ≥ 8 M dominate the production of energetic photons in a stellar population. These photons can ionize hydrogen, leading to the heating of gas. We compute a mass conversion rate for stars at different masses by estimating the mass that is enclosed in the HII region of a massive star. Our calculation assumes that the expansion of the HII region is governed by the Spitzer (1978) solution where R I (t) is the radius of the region at time t, R D is the radius at which the ionization front first undergoes the transition from R-type to D-type expansion, t D is the time at which the front reaches R D and c s is the sound speed of the ionized gas. The latter is given by c s = 11.4 T ion /10 4 K 1/2 km/s, and we adopt a value T ion = 10 4 K for the temperature of the ionized gas. 2 Using the Spitzer solution, we derive 3 the following expression for the instantaneous mass conversion rate: where m H is the mass of a hydrogen atom, and we assume n cold,den = 10 3 cm −3 for the number density of the neutral gas surrounding a newly-formed star. As star formation correlates with gas density (Gao & Solomon 2004), we expect this value to be larger than the mean density of the cold gas. Our choice of value here is motivated by Roman-Duval et al. (2010), but in practiceṀ heat is only weakly sensitive to this choice (Ṁ heat ∝ n β cold,den with −3/7 ≤ β ≤ −1/3). For other physical quantities related to cold gas mass such as the gas binding energy and the cold gas free-fall time, we use the mean cold gas density which is based on the current cold gas mass. We find that the mass conversion rate stays almost constant throughout the stellar lifetime (see Fig. B1 in Chen et al. 2022). To save computational time, we therefore use a time-averaged mass conversion rate, Ṁ heat .
The radius of the ionization front at the time that it undergoes the transition from R-type to D-type is simply the initial value for the Strömgren radius: where α B is the case B recombination coefficient, which for our adopted temperature of 10 4 K is given by α B = 2.6 × 10 −13 cm 3 s −1 (Ferland et al. 1992). The time required for the ionization front to reach R D is given by where τ recomb = (α B n cold,den ) −1 is the recombination time.
In dense molecular gas, stars form in clusters. Star clusters are less efficient in heating up the surrounding regions than isolated stars because their ionizing regions overlap. Therefore, we assume that 90% of massive stars in a halo form in one big cluster at the galactic center, whereas the other 10% of massive stars form in isolation (Chen et al. 2022). We sum up the ionizing photons from the 90% of massive stars and compute one mass conversion rateṀ heat,cl for the star cluster and compute individual Ṁ heat,iso for isolated stars. Therefore, in timestep i, the total mass conversion rate from massive stars is given by where N is the total number of massive stars. For the full derivation and the two extreme cases, see the Appendix of Chen et al. (2022).

Outflows Driven by Supernovae
We compute the gas mass ejected from the halo by supernovae by comparing the binding energy of gas with the total energy transferred to the gas by the supernovae. DM is the dominant source of the gravitational potential, which determines the gas binding energy. The full three-dimensional DM distribution within the halos is not provided in the merger trees, and we therefore assume that all halos follow the NFW profile (Navarro et al. 1996). We compute the gas binding energy by considering contributions from DM, cold and hot gas, and stellar components. The gravitational potential of a halo is dominated by DM, however, we find that the contribution of baryonic content is non-negligible. We assume that the gas surrounding massive stars that die as supernovae has been heated up during their stellar lifetimes. Therefore, supernovae in a-sloth eject the hot gas first. The binding energy of hot gas is computed with the formula: where M vir,peak is the peak virial halo mass, R s is the scale radius of the DM halo (Navarro et al. 1996), R vir is the virial radius of the halo and M i disk is the combined mass of stars and cold gas since we assume they reside in the same central region in the halo (Chen et al. 2022). We use the peak virial halo mass until the current redshift. This definition is more robust against the fluctuating results of the halo finder.
Similarly, the binding energy of cold gas can be described by (25) Complete derivations of both of these binding energies can be found in the Appendix of Chen et al. (2022).
We distribute the sum of SNe energies in this step, E i SNe , to hot gas and cold gas based on their binding energies and masses. The fractions of E i SNe that affect hot gas and cold gas are, The amount of hot gas ejected by supernovae is calculated using the expression δM i out,hot = min Here, γ out is the outflow efficiency introduced by Chen et al. (2022) and defined as where the normalization mass M out,norm and α out are free parameters whose values are calibrated in Section 3. Similarly, the amount of cold gas ejected by supernovae is determined with δM i out,cold = min

Adaptive Timestep
To evolve the baryonic physics in time, we numerically integrate the corresponding differential equations with an adaptive timestep. If chosen correctly, such an adaptive timestep saves computational time, while still allowing for accurate integration of dynamical effects that require a fine time resolution. The adaptive timestep is calculated for each halo at every time. The general idea is that we consider the main characteristic timescales of the system and then choose a timestep that is smaller than all these times. We consider the star formation timescale where M * is the amount of stars already present in this halo,Ṁ * is the current SFR, and we add a small offset mass to prevent this timestep from getting too small, i.e., to gain computational speed. These small offsets here and below are chosen empirically as the largest values that still guarantee a stable time integration under all conditions. We expect roughly one SN per 100 M . So this offset corresponds to the smallest unit of stellar mass whose formation we want to resolve in time. A smaller timestep would mean that we follow the formation of SN progenitor stars with unnecessarily high time resolution.
The accretion time scale is given by where M hot is the amount of hot gas already in the halo, M acc is the accretion rate of new hot gas from the IGM, and we add a small offset mass to prevent this timestep from getting too small. Here, the offset mass is chosen as ∼ 1% of the baryonic mass of a typical minihalo. This means that we allow a-sloth to approximate the accretion of hot gas at the percent level in order to gain computational efficiency. We also use the dynamical time as defined in Eq. 12. We found that the time integration of cold gas can be improved when we use 0.1t dyn as a constraint for the dynamical timestep.
The merger tree is defined based on redshift steps, which is the timescale on which halos merge. To account for this, we also include the time between redshift steps, ∆t z , in our calculation of the adaptive timestep.
The adaptive timestep is then given by t adapt = 0.025 min(t SF , t acc , 0.1t dyn , ∆t z ) + 1 yr. (32) The addition of one year prevents extremely small timesteps in some pathological cases and the prefactor of 2.5% is the largest value that still guarantees a stable and converged evolution of the baryon budget. The smallest of these timescales, which then defines the adaptive timestep depends on redshift, halo mass, and other factors. a-sloth provides routines to visualize the relative importance of these characteristic timescales in a specific setup. Fig. 3 shows how the different gas phases of our model evolve over time with an adaptive timestep. This figure is an example for the main branch of a merger tree leading to the formation of a MW-like galaxy and is therefore not representative of the behavior of all minihalos. It illustrates how different baryonic components are converted into each other in a minihalo in the redshift range z = 27 − 18.

Spatial Feedback
One of the primary advantages of using merger trees from N -body simulations is that the position of halos are known and that sub-halos are resolved. This opens up the possibility of modeling the spatially resolved reionization and enrichment history of the IGM. The method for this was introduced in Magg et al. (2018) and the algorithm was refined in Magg et al. (2022b). For completeness, we describe the implementation in this Section. We provide further technical details on the implementation of spatial feedback in Appendix A.3.

External Enrichment and IGM Metallicity
Each star-forming halo i is assigned an enriched region of radius R en,i . Outflows are launched at the virial radius, i.e., R en,i is initialized to be the virial radius. Here, outflows refer to baryons that are pushed outside the virial radius of a halo.
Following Agarwal et al. (2012), we assume an outflow velocity of v out = 110 km s −1 . We chose this fixed value, as due to the calibrated efficiency factor in Eq. 28, we cannot self-consistently compute the outflow velocity from momentum conservation or energetic consideration. This efficiency factor was required to be able to match the stellar mass to halo mass relation. The mass of these outflows is modeled according to Eqs 27 and 29. The expansion of the outflow is modeled as a pressuredriven snowplow. In this approximation the enriched region expands with constant momentum and slows down as it sweeps up the IGM. The expansion velocity is where ρ b is the mean baryon density of the IGM, i.e., The enriched regions also increase in volume due to cosmic expansion. When two halos with enriched bubbles around them merge, we add their volumes and their matter and metal contents. The ISM metallicity experiences strong variations during the first SN explosions. This is due to the small amount of residual gas and metals that remain in the halo. The new stellar mass δM * (purple) and the ejected metal mass δM metals (olive) are the instantaneous values at the specific timestep during subcycling. The halo mass M halo (green) increases in steps, which illustrates the original redshift steps from the merger tree. The stellar masses (orange, cyan) and outflow mass (pink) are cumulative and therefore monotonically increasing.
A halo k is subject to external enrichment if its center is within the enriched bubble of any halo. We compute the mean IGM metallicity at the position x of a halo as where E is the set of enriching halos, i.e., the set of halos j with a distance to x of less than R en,j . For each enriching bubble the mass density is and the metal density is ρ met,out,j = ρ out,j Z out,j .
The term 3Ω b Ω −1 m M vir,k (4πR 3 en,j ) −1 j∈E in Eq. 35 takes into account that the enriching bubbles have to encompass the halo that is being enriched. As we base the accretion of metals on the metallicity of the IGM, it is not limited by the mass of metals contained within the bubbles. Thus, without this term, a very small but highmetallicity bubble could lead to the accretion of more metals than it contains. With this correction term, the accretion shows the correct behavior: in the limit that the accretion is dominated by very massive bubbles and in the limit that the mass budget is dominated by the accreting halo, the accreted metal mass is limited to the available amount of metals. This method of correcting is necessary, as our algorithm currently does not allow to remove metals from the bubbles.

Photoionization of the IGM
In addition to enriching the IGM with metals, massive stars will also photoionize the surrounding IGM. We assign each halo an ionized region with radius R ion . Again, this radius is initialized to be the virial radius. For convenience, we phrase the expansion of each ionized region in terms of its volume V and afterwards compute the corresponding radius as R ion = (3V /4π) 1/3 . The change in ionized volume can be computed by modeling the expansion as an R-type ionization front, i.e., modeling the expansion assuming that the difference between the number of recombinations and the number of produced ionizing photons lead to expansion or shrinking of the ionized region. If we assume the gas inside the ionized region is fully singly ionized we find, for a nucleon number density of n and an ionizing photon emission rateṄ ion . Helium reionization is neglected in this model. We assume a clumping factor of C = 3 (Robertson et al. 2013). To ensure numerical stability, we integrate this equation with an implicit Euler method, and therefore we update the volume of ionized regions during a timestep ∆t z from k to k + 1 as Since the temperature of ionized gas in the IGM is around T ≈ 10 4 K, halos with a virial temperature lower than this cannot gravitationally bind the ionized gas. Thus halos below the atomic cooling limit (i.e., T vir = 10 4 K) are not allowed to accrete gas from the IGM. We extend this to masses up to a factor of 10 above the atomic-cooling limit (Visbal et al. 2017) for halos that are exposed to large fluxes of ionizing photons, F ion > 6.7 × 10 5 photons s −1 cm −2 . If the center of one bubble lies within another they are joined in a new bubble, where the center of the new bubble is the center of mass of the previous two bubbles and the volumes are added. This method prevents a cluster of strong ionizing sources from ionizing the same volume multiple times, and instead creates one big joined ionized region. Similar to the enriched volumes, when two halos with ionized bubbles merge, we add the volume and increase it according to cosmic expansion. We do not use a global, external ionizing background, but calculate the reionization self-consistently within our merger trees. Hence, we implicitly assume that the ionizing radiation feedback in our simulations is dominated by the local sources. The mean free path of ionizing photons in a neutral IGM at the redshift of reionization (z ∼ 7) is only 3 comoving kpc, so this is a good approximation at early times, while the size of most ionized bubbles remains small. It breaks down once connected ionized regions with sizes comparable to the size of our simulation volume start to become common, as at this point there is a substantial likelihood of such a region expanding into our volume from the outside. Hydrodynamical simulations of reionization in large volumes (e.g. Gnedin & Kaurov 2014;Rahmati & Schaye 2018;Kannan et al. 2022) suggest that this happens once the ionized gas volume filling fraction exceeds ∼ 50%, and so it is likely that a-sloth slightly underestimates the time taken to transition from this ionized fraction to a fully ionized state. In order to do a better job of modelling this late stage of reionization, it would be necessary to use merger trees drawn from a simulation of a much larger volume -for instance, Iliev et al. (2014) argue that a box size of at least 100h −1 Mpc is needed to obtain a fully converged reionization history. Unfortunately, we are not aware of a publicly available N -body simulation that is larger than the one that we use from Ishiyama et al. (2016) but that also has the mass resolution and redshift coverage that we require for a-sloth.
Finally, we note that we do not yet include the contribution from AGNs to the reionization model. This should be a good approximation as most studies find their contribution to the ionizing budget prior to reionization to be negligible (see e.g. Trebitsch et al. 2021).

Metallicities and Mixing
Stars synthesize metals and contribute to chemical enrichment. We assume progenitor-mass specific SN yields for Pop II and Pop III stars separately, and calculate the mass of each element by multiplying the yields with the number of exploding SN. The synthesized metals can escape from the halo together with the outflow bubbles. We assume that the fraction of metals ejected from the galaxy is This implementation might underestimate the metallicity of the outflow and hence of the IGM, because SNdriven outflows are usually more metal-rich than the average gas in the galaxy (Mac Low & Ferrara 1999;Emerick et al. 2018;Pandya et al. 2021). However, our simple approximation is able to reproduce the relative number of EMP stars observed in the MW, as we show below.
It is also possible that the metallicity of star-forming gas in a galaxy could be significantly different from the mean gas-phase metallicity of the galaxy (Xu et al. 2016b;Magg et al. 2022a). Tarumi et al. (2020) investigate this scenario by analyzing 3D hydrodynamic simulations and find that star-forming gas in halos that have not experienced star formation (purely externally enriched halos) has significantly lower metallicity than the mean gas metallicity. Metals that are accreted onto the halo can enrich the gas at the outskirts, but do not enrich the star-forming gas close to the center of the halo. To account for this, in a-sloth, we define a metallicity shift factor dZ as where Z sf is the metallicity of the star-forming gas and Z ave is the average metallicity of the halo. The distribution of dZ is different between internal enrichment and external enrichment (Tarumi et al. 2020). For computational efficiency, we define a metal-enriched halo as externally enriched if the total metal mass from SNe inside this halo is < 10 −15 M (effectively zero). For internal enrichment, inhomogeneity is not significant. The distribution of dZ in this case is a Gaussian function with (µ, σ) = (−0.03, 0.15). In the case of external enrichment, dZ is typically significantly negative. We use a fitting function that we have derived in our previous study (Tarumi et al. 2020) and that is provided in Appendix A.4. We apply this metallicity correction to the metallicity of the newly formed stars.
To illustrate the spatial feedback, we show a projection of the 8 Mpc box in Fig. 4. This figure visualizes the spatial extent of radiative and chemical feedback in our cosmologically representative 8 Mpc box. While this figure is a qualitative visualization to demonstrate the available information inside a-sloth, we quantify the evolution of volume filling fractions over time below.

Stochastic Feedback
In some situations, it is not possible or desirable to follow the detailed spatial feedback. Either one wants to save memory and runtime, or one is using EPSgenerated merger trees that do not have the information on the spatial positions of the halos that our spatial feedback implementation requires. For such cases, we have also implemented and tested an option for stochastic feedback that does not rely on spatial information but that still provides reasonable results.
The main idea is that we follow the volume filling fractions of metal enrichment and of ionizing radiation and then determine stochastically if a halo is currently in an ionized or metal-enriched region. Once a halo is in such a region, we assume that all its children are so as well. One could construct rare pathological cases in which this assumption does not hold, but we have found that this simplifying assumption adequately reproduces the desired behavior.
The bubbles around each halo of the ionized and metal-enriched volume expand as described above. At each check, the probability that a halo is affected by external feedback is equal to the respective volume filling fraction at that time. Therefore, we only check for external feedback once, when the halo first surpasses M crit . Otherwise, if we would check for external feedback every timestep and make the timesteps arbitrarily short, the effective probability for external feedback would artificially increase because we would check more often. We show below that checking for external enrichment once provides the correct results. Moreover, we do not add metal masses from external metal enrichment in the case of stochastic feedback; we only flag the halo as externally enriched in such a case.
For any application for which metal enrichment and radiative feedback are important, we recommend users of a-sloth to use spatially resolved merger trees from N -Body simulations instead of EPS-generated merger trees. We still provide this mode of a-sloth, because it is a very efficient way to explore trends with input parameters, and it allows new users to immediately run simulations, without the need to download several GB of merger tree data.

CALIBRATION
In this section, we first introduce the six observables, based on which we calibrate a-sloth. Then, we discuss our calibration strategy and how we found an optimal fit for the input parameters. We show how this calibrated model reproduces other constraints that a-sloth was not incentivized to fit. Finally, we discuss how the model performs with EPS-generated merger trees.

Six Observables
We use six different observables to calibrate the input parameters of a-sloth: the optical depth to Thomson scattering, the stellar mass of the MW, the cumulative distribution of stellar masses of the MW satellite galaxies, the fraction of EMP stars in the MW halo, the ratio of ultra metal-poor (UMP, [Fe/H] < −4) to EMP ([Fe/H] < −3) stars, and the cosmic SFRd at high redshift. For each of these observables, we calculate a goodness-of-fit parameter, p i , which quantifies how consistent the output of a-sloth is with observations. These parameters can have values in the range 0 ≤ p i ≤ 1, and higher values represent better fits. Some of these values are inspired by p-values (see below), for which a value of p i > 0.05 conventionally means that a model is accepted (i.e. can not be rejected).
Eventually, we multiply these six individual parameters to obtain our final goodness-of-fit measure, p all . Maximizing p all during the calibration guarantees that all six observables are reproduced simultaneously in the final model. In App. B.1, we present how the six observables depend on the input parameters. In Table 2 we present the best-fit values, which will be discussed in more detail below.

Ionization History
A key observable from the early Universe is the Thomson scattering optical depth τ e . Free electrons can scatter CMB photons, leading to a slight polarization of the CMB. As only free electrons take part in this process, τ e is influenced by the ionization fraction of the IGM as a function of time, i.e., the ionization history of the Universe. The τ e parameter is computed as where z is the redshift, σ T = 0.665 × 10 −24 cm 2 is the Thomson scattering cross-section, n H the cosmological mean density of hydrogen nuclei at z = 0, Q ion the volume filling fraction of ionized regions, and f e the number of free electrons per hydrogen nucleus in the ionized IGM. The presence of helium makes this number slightly larger than one and we assume for simplicity that where Y p and X p are the primordial abundances of He and H, respectively (Robertson et al. 2013). This prescription assumes that He + is produced at the same time as H + , primarily via radiation from stellar sources, and that the helium is later ionized further to He ++ , at z = 4, as a consequence of the steadily increasing contribution of high-redshift AGN to the extragalactic UV background. In practice, τ e is only weakly sensitive to the redshift at which this occurs. The time evolution of the volume filling fraction is based on Eq. 39.   Komatsu et al. 2011), which implies much less star formation at the highest redshifts (Visbal et al. 2015). The instantaneous ionization redshift corresponding to this optical depth is z = 8.3. We can compute τ e from the ionization fraction of the Universe that we determine based on the total volumes of the ionized regions, as described in Section 2.5.2. In Fig. 5, we show the ionization history derived from our 8 Mpc box as compared to various observations and a selection of representative models from the literature (Robertson et al. 2015;Graziani et al. 2015;de Bennassuti et al. 2017). The observational constraints are based on various different techniques (McGreer et al. 2015;Mesinger et al. 2015;Planck Collaboration et al. 2016b;Greig & Mesinger 2017;Zheng et al. 2017;Ouchi et al. 2018;Mason et al. 2018;Bañados et al. 2018;Konno et al. 2018). The models by Graziani et al. (2015) andde Bennassuti et al. (2017) are derived for a MW-like volume, and are hence not directly comparable to our cos- mologically representative box. We calibrate a-sloth only based on the optical depth to Thomson scattering, i.e., the integrated reionization history. Our ionization history (blue line) agrees well with observational constraints on the volume filling factor of ionized gas at high redshift (the data points in the figure), which confirms the consistency of a-sloth.

Stellar Mass of MW and Its Satellites
We calibrate a-sloth to reproduce the observed MW stellar mass, and the number counts of satellites at z = 0. To do this, we use the merger trees from the Caterpillar project simulations, each of which was designed to follow the build-up of a MW-like galaxy. The mean stellar mass we obtain with our fiducial a-sloth model for these MW analogs is M star,MW = (7.3±3.0)×10 10 M . For comparison, the observed stellar mass of the real MW is M obs star,MW = 5.43 ± 0.57 × 10 10 M (McMillan 2017).
In Fig. 6, we show the cumulative stellar mass function (CSMF) of satellites from the 30 Caterpillar trees, computed using our fiducial model. Each gray curve represents the CSMF from one tree, and for comparison the blue curve shows the observed CSMF from Mc-Connachie (2012) and Muñoz et al. (2018). Due to the limit of current surveys, the search for MW satellites is incomplete below M * ∼ 10 5 M . We shade the region below this mass in gray and only calibrate our results to the observed CSMF above this observational completeness limit. We note that these compilations of satellite galaxies do not yet include AntII and CraII, which both have stellar masses of ∼ 10 6 M (Ji et al. 2021), but including these two galaxies would not significantly change our results.
To quantify the difference between our predicted CSMFs and the observed one, we use a two-sample Kolmogorov-Smirnov test. We perform the tests on each CSMF produced by a-sloth and the observed CSMF. From each test, we retrieve a p-value, which we use as a goodness-of-fit parameter. We then compute the average of these parameters from the 30 tests. . CSMFs at z = 0 extracted from our set of 30 Caterpillar trees, processed using our fiducial model parameters. The CSMFs from our Caterpillar trees are plotted in gray and the observed CSMF for the MW is plotted in blue. The gray shaded area represents the region in which the observations are known to be incomplete. The observed CSMF in this region is plotted using a dashed line and is not used to calibrate the model. We note that, by construction, we neglect the effects of tidal stripping. While this effect is important for at least some of the nearby satellites (Simon 2018), there are several reasons why it cannot be accurately modeled in our current framework. Firstly, both the shape of the potential well of the MW and of the satellites are inaccurate without accounting for the baryonic component and without the presence of the MW disk. Thus the tidal stripping cannot be accurately modeled in the dark-matter-only simulations that we base our model on Bullock & Boylan-Kolchin (2017). Secondly, halo finders often struggle to correctly identify subhalos and their masses. Indeed, in many cases we see a substantial increase in the mass of the subhalo during tidal stripping events, which is caused by the halo finders associating some of the mass of the main halo to the subhalo. Furthermore, the effect of tidal stripping is more significant on satellites that are in proximity to the MW. Thus we neglect tidal stripping in the current work and leave it to future versions, ideally built on merger trees with disk potentials to improve on this aspect.

Extremely Metal-Poor Stars
We use two metallicity-related values for the calibration. These are the fraction of stars in the MW that are EMP stars, f EMP , and the ratio between the fraction of UMP stars and the fraction of EMP stars, f UMP /f EMP . We focus on these two individual values rather than comparing with the full MDF because it is easier to combine the values obtained from several different surveys (see below) and because it gives us a straightforward way of quantifying our agreement with the observations. We assume that EMP stars in a-sloth are predominantly found in the halo (Kobayashi et al. 2011). Also, Arentsen et al. (2020) did not find any EMP stars among 8000 spectroscopically observed bulge stars. However, Sestito et al. (2019) found several EMP stars with disklike orbits.
For the EMP fraction, we assume that (i) the stellar mass in the MW is 6×10 10 M (McMillan 2017); (ii) the stellar mass in the halo of the MW is 1×10 9 M (Bullock & Johnston 2005); and (iii) 1 out of 800 halo stars is an EMP star (Youakim et al. 2020). For the UMP to EMP ratio, we assume three possible values based on the literature. Youakim et al. (2020) find in the Pristine survey that 10 −2 of EMP stars are UMP stars. Chiti et al. (2021) argue that the MDF dN/dZ can be wellfitted with a power-law function Z −1.5 , implying that f UMP /f EMP = 10 −1.5 . Finally, Bonifacio et al. (2021) use Gaia and SDSS photometry in the TOPoS survey to derive a UMP to EMP ratio of 8/217 10 −1.4 .
To evaluate the goodness-of-fit, we calculate how significantly the two metrics (f EMP and f UMP /f EMP ) are away from these observations. We outline here the pro-cedure we use in the case of f EMP but note that the treatment is similar for f UMP /f EMP . For the 30 Caterpillar merger trees, we first calculate the mean value of the logarithm of f EMP , µ CTP = mean[log 10 (f EMP )], and its standard deviation σ CTP = stddev[log 10 (f EMP )]. Then we calculate the "error" from an observation i as f i = |µ CTP − log 10 [f EMP,i ]|/σ. Here i stands for the three observations that we compare to, and σ = (0.5) 2 + σ 2 CTP is the error estimate that includes 0.5 dex observational uncertainty. The main source of this large uncertainty is the stellar mass estimate in MW halo that ranges from 4 × 10 8 M (Bell et al. 2008) to 10 9 M (Bullock & Johnston 2005). Then we use min(f Pristine , f Chiti , f TOPoS ) as the goodness-of-fit evaluator.
Our approach predicts a mean EMP fraction of log 10 [f EMP ] = −4.39 together with a mean EMP to UMP fraction of log 10 [f EMP /f UMP ] = −1.23. The value of log 10 (f EMP ) that we obtain lies 0.40σ away from the observed value. The value of log 10 (f UMP /f EMP ) lies 0.58σ away from the Bonifacio et al. (2021) value and around one sigma away from the Chiti et al. (2021) value. We consider these to be good fits, which provide the second and third highest goodness-of-fit parameters out of the six observables.

Cosmic Star Formation Rate Density
The cosmic SFRd quantifies how many stars form per unit volume per unit time in the Universe. a-sloth can predict the SFRd at high redshifts and surveys can estimate the SFRd out to z ∼ 10. Since the SFRd is a cosmic average, we use the 8 Mpc box as a cosmologically representative sample of the Universe. We only have DM halo merger trees for the 8 Mpc box down to redshift z ∼ 5. So for this observable, we compare the SFRd from asloth to observations in the redshift range 5 ≤ z ≤ 10.
The SFRd cannot be observed directly. Instead, galaxy counts at high redshift need to be de-biased and the galaxy mass function needs to be extrapolated to the faint end at different redshifts. Hence, the observationbased SFRd is not well constrained at high redshift. Therefore, we will compare our model with two extreme cases. Specifically, we compare our modeled SFRd to the predictions by Madau & Dickinson (2014) and Behroozi et al. (2019). These two estimates differ by up to 1 dex at high redshift. We require that a-sloth lies between these observations or penalize the final goodness-of-fit parameter if our prediction is above both or below both observations.
We compare the maximum difference between our model and observations by using the cumulative stellar mass density that has formed up to a certain redshift.
To simplify the interpretation, we calculate the cumulative stellar mass density as function of cosmic time (and not redshift), so that the resulting unit is stellar mass per unit volume. We compare these cumulative stellar mass densities in the range 300 Myr -1300 Myr after the Big Bang, which is given by the highest redshift for which we have observational constraints and the lowest redshift of the DM merger tree. Fig. 7 shows the two observation-based constraints and our modeled prediction. At higher redshift, the . Best a-sloth model (black) compared to two independent observation-based constraints. We incentivize the result to lie between these two constraints and penalize the maximum distant that our modeled cumulative SFRd is outside the green band.
SFRd of a-sloth is below the prediction of both Madau & Dickinson (2014) and Behroozi et al. (2019). At each time, we measure how far we are outside the band that is allowed by observations and we normalize it by the width of this band. Then, we quantify the discrepancy by choosing the time at which this relative discrepancy is largest. In this case, we would then calculate the relative distance as Then, we convert this maximum relative discrepancy into a goodness-of-fit parameter via This formula provides the correct limits of p SFR = 1 for a good agreement between model and observation and an exponential penalty if the prediction of a-sloth is outside the observed values. The factor of 4 is an empirical weighting to give this observational constraint similar importance compared to the five observables. Dubois et al. (2021) show with a 3D hydrodynamical simulation that the largest uncertainty to the SFRd is the cosmic variance in their zoomed-in simulated volume of approximately 4187 cMpc 3 . Our effective volume is smaller, which might even worsen this effect. Crosby et al. (2013) find with their SAM that cosmic variance between different random realizations of the same cosmic volume can result in SFRd that differ by up to a factor of two. This is of the same order as the observational uncertainty of the SFRd at high redshift. We plan to investigate this variance in detail in a future study in a larger effective volume.

Calibration Strategy
Testing each new combination of parameters requires us to run a-sloth using the 30 Caterpillar trees and the 8 Mpc box, and takes about 1 h wallclock time. Exploring the entire high-dimensional parameter space to obtain the full posterior distribution is prohibitively expensive for this optimization problem.
Therefore, we first identify three input parameters that are degenerate with other parameters and to which the observables are not very sensitive: M min , α III , and f esc,III . The minimum mass of Pop III stars is degenerate with the star formation efficiency, as a smaller value of M min can be compensated with a larger value of η III . The escape fraction for Pop III stars is degenerate with that for Pop II stars, and the slope of the Pop III IMF is also degenerate with the maximum mass of Pop III stars and their star formation efficiency (SFE). Furthermore, we have verified that these three parameters have only a small effect on the fit of our six observables. We therefore keep them fixed during the calibration and will study their influence in more detail in a follow-up study.
The values of these three fixed parameters are chosen and motivated as follow. There are no observed metalfree stars in the MW, which constrains M min 0.8 M (Hartwig et al. 2015;Ishiyama et al. 2016;Magg et al. 2019;Rossi et al. 2021). On the other side, we have seen evidence of the chemical fingerprint of supernovae from metal-free progenitor stars with masses of ∼ 10 M (Ishigaki et al. 2018). We have chosen our fiducial M min = 5 M between those two limits.
A logarithmically flat IMF slope (α III = 1.0) puts equal importance on each logarithmically spaced mass bin. This avoids any prior assumptions about dominant mass ranges. Moreover, several hydrodynamical simulations of Pop III star formation yield results that are consistent with this IMF slope (Greif et al. 2011;Stacy et al. 2016;Wollenberg et al. 2020;Sharda et al. 2021;Latif et al. 2022).
The Pop III SFRd in a-sloth is rather high compared to other simulations at z > 20 as we show below. In order to avoid that the Pop III contribution to reionization requires an even higher SFRd, we set f esc,III = 60%, the highest value that is allowed by theory and observations (Dayal & Ferrara 2018).
In Table 3, we list the six remaining free parameters of our model and the parameter ranges that we have explored while searching for an optimal model fit. We aim to fit six observables with six free parameters. Two of these observables (CSMF and SFRd) are functions (and not just single values) so that the final calibration problem is over-constrained.
We expect further degeneracies of input parameters and we do not know if there is one unique global optimum, or several local optima instead. The goal of this optimisation strategy is therefore not to prove that we found the global optimum. Instead, we try to find a model configuration that reproduces all observables and that is robust to small variations in the input parameters.
We calibrate our model in two steps. First, we run a random parameter exploration in which we sample all input parameters randomly. During this step, we sample 128 different combinations of parameters. From these 128 parameter combinations, we choose the top four models that provide the highest value of p all . These four models are then used as the starting point of "Quadient Descent" (QD), a combination of quadratic fits and gradient descent that we have developed to calibrate a-sloth.
Instead of optimizing all 6 input parameters at once, we calibrate a-sloth with a sequence of onedimensional optimizations. At each QD step, we choose a random input parameter that should be explored. Ideally, we want to calculate the gradient dp all /dx i , where x i is the i-th input parameter that is currently under investigation. However, a-sloth includes random sampling of variables in various places so that the resulting gradient dp all /dx i is not a continuous function of x i . Therefore, we cannot just sample one additional point to calculate the gradient. Instead, we calculate four additional models and fit a quadratic function to the obtained p all (x). A quadratic fit is the lowest-order polynomial fit that is able to match local optima in the plandscape and four sampled points are an efficient tradeoff between sample density and computational time. The four explored values, x i , of one parameter are chosen randomly based on a normal distribution centered at the previous value. Eventually, we choose the value of x i for which the fit predicts the highest p all (inside the sampled parameter range). An example of one such QD step can be seen in Fig. 8 input parameters, the QD chains should converge to a similar solution. We stop these QD chains after around 350 iterations and analyze their explored and preferred parameter ranges in Fig. 18 in the Appendix. All chains converge to a local optimum and we cannot find any combination of input parameters that provides a significantly better fit. Three out of four chains converge to the same optimal parameters. One chain prefers a slightly higher α out . This higher value results in a better fit to the CSMF, but a worse fit to the MW stellar mass. The values for p all for this chain are similar to the values for the other three chains.
To obtain one set of optimal parameters, we proceed in the following way. We cut the first 60 iterations, which guarantees that we only take sampled parameters into account for which the chains are already exploring near the optimum. Then, we take all the best fit parameters from all chains together and use the median value as our final best-fit value. Using the median is less sensitive to outlying chains. The resulting best-fit values and observables of our best-fit model are summarized in Tab. 2. The best fitting parameters are reported in the last column of Tab. 3. While the Pop III star formation efficiency of η III = 0.38 might appear high, we note that it is defined as the fraction of cold gas that forms stars per free-fall time. It is therefore not directly comparable to numerical simulations, which usually define the SFE as mass fraction and not per characteristic timescale (e.g. Latif & Khochfar 2019). We plan to explore the full posterior distributions of these parameters in a future study. Note-The explored range of Mmax should enclose the mass range of PISNe and the upper limit for the escape fraction is based on Dayal & Ferrara (2018). Other explored ranges are based on previous experience (Chen et al. 2022) and extensive experiments.  (Jaacks et al. 2018;Salvadori et al. 2014;Pallottini et al. 2014;Visbal et al. 2020).

Further Constraints
Besides these six observables that we actively try to reproduce, we also can compare our results to further results from the literature and from independent simulations. For example, we find a similar metal enrichment history as in other simulations (Fig. 9). Our results are in excellent agreement with Visbal et al. (2020) and in good agreement with other models.
We show the SMHM relations produced by our fiducial model as gray curves in Fig. 10 . We show the SMHMs at z = 0 of 30 Caterpillar trees from our best model. At given M vir,peak , we show the mean M * (gray circles) and the standard deviation among the galaxies from each tree. We also plot the SMHM derived by Nadler et al. (2020) as gray contour, where the dark gray shows the 1 σ region and the light gray shows the 2 σ region. The SMHM derived by Garrison-Kimmel et al. (2014) is plotted in red until the observational completeness.
the mean stellar mass of galaxies at given M vir,peak along with 1σ among them. The SMHM relation derived by Garrison-Kimmel et al. (2014) is plotted in red, where the curve ends at M * = 2.9 × 10 5 M due to the observational completeness. We plot the SMHM relation of Nadler et al. (2020) as gray contours.
We find that the difference among SMHM relations from 30 Caterpillar trees is small. The larger scatter at > 10 11 M comes from the small statistics. The SMHM relations are consistent with both the relations presented by Garrison-Kimmel et al. (2014) and Nadler et al. (2020) above the observational completeness and consistent with the relation of Nadler et al. (2020) below the observational completeness. We find a steepening of the SMHM relation below M vir,peak ≈ 10 8 M . This is contrary to the finding in an earlier study (Chen et al. 2022, that utilized an earlier version of a-sloth), where they found a flattening of SMHM relation below M vir,peak ≈ 10 9 M . The flattening presented by Chen et al. (2022) comes from the high Pop II star formation efficiency (they adopted a fiducial value of η II = 2). This high η II introduces a minimal stellar mass that forms before the stellar feedback kicks in and regulates further star formation. In our fiducial model, we have η II = 0.19, which avoids such a threshold. The physical models have also been improved, compared to the one used by Chen et al. (2022), where the Pop III star formation, stellar feedback, and the general chemical model were simpler than our fiducial model. In Fig. 11, we show MDFs of the 30 Caterpillar trees and three observational estimates (Youakim et al. 2020;Bonifacio et al. 2021;Chiti et al. 2021 Fig. 12. Our Pop III SFRd is higher at high redshift. However, when we compare the cumulative stellar mass density of Pop III stars of a-sloth with other models, our prediction is in the middle of other literature values. With a-sloth we obtain a cumulative Pop III stellar mass density of 9.5×10 4 M Mpc −3 . This is within the range of other models that provide predictions from 1.8 × 10 3 M Mpc −3 (Mebane et al. 2018) to 2.9 × 10 5 M Mpc −3 (Sarmento et al. 2019).
We also analyze the recovery time between Pop III and Pop II star formation. For this comparison, we define the recovery time as the time between the first Pop III SN and the first Pop II SN in a halo. This timescale is set by the radiative and mechanical feedback of the Pop III star formation and the ability of the halo to retain or accrete new gas. We find recovery times in the range of ∼ 1 Myr − 1 Gyr. This spread is in good agreement with numerical simulations (Jeon et al. 2014;Chiaki et al. 2018;Latif & Schleicher 2020;Hicks et al. 2021;Magg et al. 2022a). Although such a large range is seen in several different simulations, its cause is still poorly understood. Chiaki et al. (2018) relate it to the ratio of the energy injected by feedback (from both photoionization and SNe) to the binding energy of the halo. In this picture, if the total energy output by a star is larger than the binding energy, the halo will undergo a long recovery process of at least several tens of millions of years. Alternatively, Magg et al. (2022a) suggest it could be caused by variations in the amount of sub-structures present in the minihalos. These additional comparisons are further confirmation that our model for baryonic physics in minihalos works well.

Calibration of EPS Mode
To test EPS trees and our implementation of stochastic feedback, we compare three different cases. First, we run a Caterpillar tree with spatial feedback (our fiducial model). Then, we run exactly the same merger tree, but ignore the spatial information and use the stochastic feedback instead. Those two runs should produce similar results. Then, we run a third case in which we use an EPS-generated merger tree of a halo with the same mass as the MW with stochastic feedback. Ideally, the essential physics can be implemented appropriately, despite the difference in merger tree details, so that also this last case should produce similar results. However, with EPS, we follow only the formation history of the MW main halo and ignore subhalos and environmental effects. Therefore, we do not expect that this EPSbased run produces identical results to the Caterpillarbased runs. Previously, we have confirmed that external feedback affects the observables and that about 10% of halos are enriched externally before they are enriched internally (Tarumi et al. 2020).
The comparison between different models of external feedback can be seen in Fig. 13 and Fig. 14 Fig. 11 and Fig. 12). We focus on the difference between our three models and the SFRd in a MW-like volume should not reproduce the cosmologically representative SFRd. Hence, we do not focus on the comparison to literature here, but rather provide them to guide the eye. The two models that are based on Caterpillar merger trees (blue, orange) provide similar results, whereas the EPS-based model differs at high redshift and low metallicity.
runs that are based on the same Caterpillar merger tree result in very similar star formation rates. The MDFs are slightly different because we do not accrete externally enriched metals in the run with stochastic feedback. The run based on EPS merger trees deviates from the spatially resolved merger trees at high redshift and low metallicities. . We compare three different approaches to model a MW-like galaxy with and without spatial information. The gray lines and symbols are models from the literature (compare Fig. 5 and Fig. 9). These literature models should only guide the eye since a MW-like volume is not representative and these simulations do not intend to reproduce these literature values. The early peak of ionizing radiation in the EPS-based model is due to a smaller effective volume and due to a higher Pop III SFRd.
merger trees are very similar. To obtain an estimate of the total volume, we divide the total mass of a merger tree at z = 0 by the mean cosmic density. Since the MW is not cosmologically representative, we do not expect to reproduce the ionization history or cosmic evolution of metals in these test scenarios.
Overall, we see that our implementation of stochastic feedback is working. However, EPS trees are in general not very well suited to reproduce observables because of their simplified structure that cannot reproduce a MWlike galaxy with its satellites and cosmic environment. a-sloth is calibrated and intended for the use with spatially resolved merger trees. We recommend EPS merger trees only for initial tests and the use of spatially resolved merger trees for all scientific applications.

SUMMARY
a-sloth is the first public SAM that was designed to simulate the formation of the first stars and galaxies. In this paper, we have shown how a-sloth can reproduce six observables with our calibration strategy: the ionization history of the Universe, the stellar mass of the MW and its satellite galaxies, the relative fractions of EMP stars, and the SFRd at high redshift.
We have shown that a-sloth is computationally efficient and has versatile applications (Sec. 1.1). Users can start experimenting with the memory-friendly EPS mode, and then run scientific experiments on publicly available merger trees. a-sloth includes sophisticated models for external feedback (Sec. 2.5), inhomogeneous metal mixing (Sec. 2.5.3), and baryonic physics (Sec. 2.2) inside the first galaxies. All modules and parameters can be modified by the user in order to study new physics. It is the first SAM that samples the formation of individual Pop III and Pop II stars, which allows to model feedback by individual SNe in dwarf galaxies.
In addition to this code release paper, we also plan to publish the main data products of a-sloth that can directly be used by the community, such as SFRd, recovery time distribution, and MW satellite properties with convenient fitting functions.
We also expect that a-sloth can contribute to new scientific discoveries in the next decade, by optimizing surveys with JWST or interpreting LISA and third generation terrestrial detector observations. For example, a-sloth can predict the optimal redshift to search for direct detections of Pop III SNe (Magg et al. 2016;Hartwig et al. 2018a) or the contribution of Pop III remnant binary black holes to the data stream of GW detectors. With these next-generation telescopes and surveys, we will greatly extend the frontier of observational cosmology, which is bound to result in serendipitous discoveries of unknown objects and events. Given its inherent adaptability and computational speed, a-sloth is ideal for quickly interpreting such frontier observations, and helping to define follow-up campaigns.
We thank Boyuan Liu, Britton Smith, Anna Schauer, and Betül Uysal for valuable discussions. We also thank Stefania Salvadori, Rosa Valiante, and the anonymous referee for constructive feedback. We acknowledge the work of Lacey & Cole (1993) and Parkinson et al. (2008), who developed galform, based on which we started the implementation of a-sloth. We acknowledge funding from JSPS KAKENHI Grant Numbers 19K23437, 20K14464, and 20H05845. The team in Heidelberg is thankful for support from the Deutsche Forschungsgemeinschaft (DFG) via the Collaborative Research Center (SFB 881, Project-ID 138713538) "The Milky Way System" (sub-projects A1, B1, B2 and B8) and from the Heidelberg Cluster of Excellence (EXC 2181 -390900948) "STRUCTURES: A unifying approach to emergent phenomena in the physical world, mathematics, and complex data", funded by the German Excellence Strategy. They also acknowledge funding form the European Research Council in the ERC Synergy Grant "ECOGAL -Understanding our Galactic ecosystem: From the disk of the Milky Way to the formation sites of stars and planets" (project ID 855130). MV acknowledges funding from the European Research Council under the European Community's Seventh Framework Programme (FP7/2007. The project benefited from computing resources provided by the State of Baden-Württemberg through bwHPC and DFG through grant INST 35/1134-1 FUGG, and from the data storage facility SDS@hd supported through grant INST 35/1314-1 FUGG. We also acknowledge the Texas Advanced Computing Center (TACC) for providing HPC resources under XSEDE allocation TG-AST120024. And we acknowledge the Leibniz Computing Center (LRZ) for providing HPC resources in the project pr74nu.

A.1. Fitting Function of Halo Concentration
We follow the fitting functions in Correa et al. (2015) to compute the halo's concentration at 0 ≤ z < 10 and adopt a constant halo concentration of 3.5 at z > 10. At 0 ≤ z < 4, we have: where v BC σ rms (z) is given by Equation A6 above. The full derivation of M 0 can be found in Appendix D in Chen et al. (2022).
In Fig. 15 we show the evolution of M crit vs. redshift for several different values of v BC for the three implemented models.

A.3. Technical Implementation of Spatial Feedback
Determining which ionized and enriched bubbles a halo sits inside of in its most simple implementation requires computing the distance between each possible pair of halos. This would lead to the computational cost of the simulations scaling as N 2 for N halos and implies having to compute trillions of distances on each timestep. To avoid this we use a tree-based approach that for a given target halo limits the number of halos to which we need to compute the distances.
In this approach we build an oct-tree of bubbles and consecutively add all bubbles during a timestep to it. The tree is started with a head-node encompassing the entire physical volume. This node is split into 8 sub-nodes (i.e., it is split in two along each dimension) and the sub-nodes are split into 8 smaller nodes. We limit the tree to a depth of 20 levels, but only allocate nodes that actually contain at least one bubble. Each bubble is associated with exactly one node of this oct-tree, namely the smallest node that fully encompasses the bubble. Thus, large bubbles, and bubbles that are very close to edges of the larger nodes are located in the first few levels, small bubbles are located in the lower levels. This is illustrated in Fig. 16.
In order to find all bubbles that overlap with a point (e.g., the center of a halo), one only has to test bubbles associated with tree-nodes the point is inside of. This algorithm results in identical results as testing all pairwise combinations of halos at a substantially reduced computational cost.

A.4. Fitting Functions for Inhomogeneous Metal Mixing
In Tarumi et al. (2020), we obtain fitting functions for internal and external enrichment separately. For internal enrichment, we use a Gaussian distribution function with µ = −0.03, σ = 0.15: p(x) = 1 2π(0.15) 2 exp (x + 0.03) 2 2(0.15) 2 . (A11) Figure 16. Illustration of the oct-tree construction for determining which bubbles a halo lies inside of. The colors represent the levels (black: 1st level, blue: 2nd level, orange: 3rd level). The bubbles (circles) are colored according to the node they are associated with. Each bubble is associated with the smallest tree node it is completely contained within. Table 4. Parameters to be used for the external enrichment fitting function. These parameter sets are for "−dZ", not dZ itself.
B. SUPPLEMENTARY INFORMATION ON CALIBRATION

B.1. Random Parameter Exploration
The first step of our calibration is that we run 128 different models with random input parameter combinations. In Fig.17 we show how the goodness-of-fit parameters depend on the input parameters. For example, the optical depth is sensitive to the ionizing escape fraction and the CSMF of MW satellites is sensitive to our normalization Figure 17. Correlations between the input parameters (horizontal) and goodness-of-fit parameters (vertical) for 128 randomly chosen combinations of input parameters. This plot illustrates which observables are sensitive to which input parameters. mass of supernova feedback. Furthermore, the maximum mass and SFE of Pop III stars is sensitive to EMP-based quantities. In the last row (p all ) we see which parameter ranges are preferred or disfavored. For example, models with M max > 200 M can reach higher values of p all than models with M max < 200 M . Overall, this representation demonstrates that such a random parameter exploration in this 6D space is not effective to find a global optimum. Hence, we only use it as a first exploration of the parameter space and then optimize the calibration with QD.

B.2. Quadient Descent
In Fig. 18, the four colors show four independent chains. Each point shows one explored model and each step consists of 4 different models. The four chains converge to optimal values for all input parameter. Only the orange chain prefers a slightly higher value for α out . As we can see in the bottom two panels, this higher value of the feedback efficiency exponent yields a better fit to the CSMF, but a worse fit to the MW stellar mass. This is a legitimate trade-off between two individual goodness-of-fit parameters, and p all converges to similar values for all four chains. Figure 18. Iterative exploration and optimization of the input parameters (left) and how the goodness-of-fit parameter change during this optimization (right).