Late-end reionization with aton-he: towards constraints from Lyman-α emitters observed with JWST

We present a new suite of late-end reionization simulations performed with aton-he, a revised version of the GPU-based radiative transfer code aton that includes helium. The simulations are able to reproduce the Ly 𝛼 flux distribution of the E-XQR-30 sample of QSO absorption spectra at 5 ≲ 𝑧 ≲ 6 . 2, and show that a large variety of reionization models are consistent with these data. We explore a range of variations in source models and in the early-stage evolution of reionization. Our fiducial reionization history has a midpoint of reionization at 𝑧 = 6 . 5, but we also explore an ‘Early’ reionization history with a midpoint at 𝑧 = 7 . 5 and an ‘Extremely Early’ reionization history with a midpoint at 𝑧 = 9 . 5. Haloes massive enough to host observed Ly 𝛼 emitters are highly biased. The fraction of such haloes embedded in ionized bubbles that are large enough to allow high Ly 𝛼 transmission becomes close to unity much before the volume filling factor of ionized regions. For our fiducial reionization history this happens at 𝑧 = 8, probably too late to be consistent with the detection by JWST of abundant Ly 𝛼 emission out to 𝑧 = 11. A reionization history in our ‘Early’ model or perhaps even our ‘Extremely Early’ model may be required, suggesting a Thomson scattering optical depth in tension with that reported by Planck, but consistent with recent suggestions of a significantly higher value.

Attempting to delineate this phase in time, several observational probes have been employed to constrain the onset and end of the EoR.Observations of Thomson scattering of CMB photons by free electrons formed during reionization offers valuable insights into the timing of this epoch (Kogut et al. 2003 ;Planck Collaboration VI 2020 ).The completion of reionization is now believed to have occurred between z = 5 and 6, inferred from the accelerated evolution of the Ly α optical depth at z > 5 (Becker et al. 2001 ;Fan et al. 2006 ), concomitant with the observed decline of Ly α emission from high-redshift galaxies (Stark et al. 2010 ;Choudhury et al. 2015 ).
To gain a thorough understanding of this epoch, various teams have performed a range of diverse simulations to replicate the later stages of reionization.These efforts can be grouped into two categories: the first involving coupled hydrodynamics and radiative transfer (Gnedin 2014 ;Rosdahl et al. 2018 ;Ocvirk et al. 2020 ;Kannan et al. 2021 ), and the second using the technique of postprocessing hydrodynamical simulations with radiative transfer (Iliev et al. 2006 ;Aubert & Teyssier 2010 ;Eide et al. 2018 ; see re vie w by Gnedin & Madau 2022 ).In post-processing simulations, the absence of coupling between radiation and hydrodynamics results in the omission of the hydrodynamic response of the IGM to inhomogeneous photoheating, and limits the detailed modelling of MNRAS 533, 2843-2866 (2024) galaxy populations and the escape of ionizing radiation.None the less, as the pressure smoothing scale of the IGM is small, neglecting the radiation-hydrodynamic coupling does not significantly affect the spatial distribution of ionized and neutral regions on large scales (Kulkarni et al. 2015 ;Puchwein et al. 2023 ).
On the other hand, these simulations offer computational advantages, being more economically feasible.They are particularly effective in reproducing the large-scale characteristics of the Ly α forest during and post-reionization, provided that the ionizing emissivity is well calibrated (Kulkarni et al. 2019a ;Keating et al. 2020a ).Note also that more recently Puchwein et al. ( 2023 ) presented a hybrid scheme where the result of hydrosimulations post-processed with ATON were used to capture the hydrodynamic response of the IGM to inhomogeneous reionization iteratively by re-running the hydrosimulations with inhomogeneous photoheating rates derived from the post-processed simulations, offering an interesting compromise.
The work by Kulkarni et al. ( 2019a ) and Keating et al. ( 2020a ) utilized the hydrogen-only radiative transfer code ATON , with a single photon frequency, to construct models that accurately reproduce the substantial spatial fluctuations in the ef fecti ve optical depth of the Ly α absorption in QSO spectra at z > 5 (Bosman et al. 2018 ).These simulations, ho we ver, did not include helium, and the ad hoc increase in photon energy invoked to imitate the effect of helium, along with the lack of multiple frequencies, somewhat restricted the scope of these simulations.More precise comparisons between the IGM temperature at z < 6 and observations will necessitate a more comprehensive model of the IGM.Such a model should encapsulate both hydrogen and helium, along with their interactions with higher energy photons (Ciardi et al. 2012 ).More recent data also asks for better modelling.Bosman et al. ( 2022 ) have presented measurements of the ef fecti ve Ly α opacity of the IGM at redshifts 5-6 from high resolution, high SNR quasar spectra from the E-XQR-30 data set (D'Odorico et al. 2023 ).This data set is the combination of quasar spectra found by the ESO Large Programme 'XQR-30', and 12 spectra from the XSHOOTER archi ve.These ne w measurements offer more stringent constraints on the distribution of neutral islands in the early Universe and extends the observational reach to higher redshifts.Consequently, these enhancements in observational data quality necessitate a re-e v aluation and recalibration of existing models to ensure that they reproduce the most current and accurate observations.
Here, we build on the work of Kulkarni et al. ( 2019a ) and Keating et al. ( 2020a ) by incorporating helium into ATON and conduct multifrequency simulations calibrated to the mean Ly α flux measurements by Bosman et al. ( 2022 ).We also investigate the effect of the rarity of the sources, studying a range of source populations and their contributions to the o v erall photon budget necessary for reionization.It is widely assumed that galaxies serve as the primary sources of ionizing photons (Bunker et al. 2004 ;Bouwens et al. 2015 ;Finkelstein et al. 2019 ;Ocvirk et al. 2021 ), whereas highredshift active galactic nuclei are assumed to have a limited impact (Shapiro & Giroux 1987 ;Songaila, Cowie & Lilly 1990 ;Meiksin & Madau 1993 ;Ciardi et al. 2000 ;Bolton et al. 2005 ;Volonteri & Gnedin 2009 ;Fontanot, Cristiani & Vanzella 2012 ;Madau & Haardt 2015 ;Qin et al. 2017 ;K ulkarni, Worseck & Henna wi 2019b ).Ho we ver, uncertainties about the specific galaxy types involved and their associated mass ranges remain.Following Cain et al. ( 2023 ), in addition to our fiducial model, we developed simulations employing 'Oligarchic' and 'Democratic' source models.The Oligarchic model e v aluates reionization driven by rarer bright sources, and the Democratic model assesses the possibility that the total ionizing emissivity is dominated by lower mass faint galaxies (see Hassan et al. 2022 for a recent discussion of the effect of different source models on reionization).
JWST has opened a new chapter in reionization studies with its much impro v ed spectroscopic capabilities allowing to obtain highquality spectra of reionization epoch galaxies and active galactic nuclei (AGNs) to z = 11 and beyond.Galaxies emitting Ly α radiation are an important probe of the neutral hydrogen fraction of the IGM (Zheng et al. 2017 ;Mason et al. 2018Mason et al. , 2019 ; ;Nakane et al. 2024 ;Umeda et al. 2024 ).Recent spectroscopic observations conducted by JWST have confirmed the presence of galaxies emitting Ly α radiation at a redshift of z ∼ 11 (Tacchella et al. 2023 ).Ly α emitting galaxies must inhabit large ionized regions with radii 0 .5 pMpc (Miralda-Escud é 1998 ; Mason & Gronke 2020 ;Hayes & Scarlata 2023 ) to a v oid scattering of the Ly α photons out of the line of sight by the parts of the IGM where hydrogen is still neutral (Mesinger et al. 2014 ;Weinberger et al. 2018 ;Keating et al. 2024 ;Tang et al. 2024 ;Witstok et al. 2024 ).To study the prospects of constraining earlier stages of reionization, we construct three reionization models, with a deliberately wide range of early stage evolution.In addition to our fiducial simulation that has a mid-point of reionization at z ∼ 6 .5, we construct an 'Early' model with a midpoint of reionization at z ∼ 7 .5, a new 'Extremely Early' model with a mid-point of reionization at z ∼ 9 .5. All three models are calibrated to the Ly α mean transmission at redshift 6.The fiducial model is similar to the model of Kulkarni et al. ( 2019a ) and the fiducial model of Keating et al. ( 2020a ).We will see below that only the 'Early' and fiducial models are consistent with the Thomson scattering optical depth reported by Planck (Planck Collaboration VI 2020 ), but see recent work by de Belsunce et al. ( 2021 ) and Giar è, Di Valentino & Melchiorri ( 2023 ) for suggestions that the Thomson scattering optical depth may need to be revised upwards.
The paper is structured as follows.Section 2 describes our simulation set-up.Section 3 discusses our calibration approach and examines the effects of the number of frequency bins, inclusion of helium, and source SED on v arious observ ables.Section 4 explores variations of the source model.In Section 5 , we assess the impact of varying the mid-point of reionization.Section 6 describes our method for calculating ionized bubble sizes, discusses the source distributions within ionized regions, and draws implications for the visibility of LAE at high redshifts with JWST .The broader context of our findings is discussed in Section 7 , comparing our work with other studies in the field.The paper concludes in Section 8 , summarizing our main findings.Throughout this work, we assume a Lambda cold dark matter cosmology with parameter values from Planck Collaboration XVI ( 2014 ) ( m = 0 .308 , λ = 0 .6982 , h = 0 .678 , b = 0 .482 , σ 8 = 0 .829, and n = 0 .961).

S I M U L AT I O N S E T-U P
Our reionization simulations are performed by post-processing cosmological hydrodynamical simulations for radiative transfer using ATON-HE , which is a modified version of the A TON code.A TON-HE incorporates helium into ATON and o v erhauls the multifrequenc y implementation of the original code.

Cosmological hydrodynamical simulations
We post-process simulations from the Sherwood-Relics suite of simulations (Puchwein et al. 2023 ) that were run using the Tree-PM SPH code P-GADGET-3 .The suite consists of se veral dif ferent box sizes, out of which we use the box with size 160 cMpc /h for our models.The simulations have 2 × 2048 3 gas and dark matter Downloaded from https://academic.oup.com/mnras/article/533/3/2843/7733916 by University of California, Santa Barbara user on 10 September 2024 MNRAS 533, 2843-2866 (2024) particles.In our previous work, we have identified 160 cMpc /h as the 'sweet spot' for calibration of 2048 3 simulations to the flux PDF (Kulkarni et al. 2019a ;Keating et al. 2020b ).Note, ho we ver, that as discussed in more detail in Feron et al. ( 2024 ) even at this dynamic range simulations with a box size of 160 cMpc /h are not fully converged with regard to either resolution or box size.The simulations are initialized at a redshift z = 99, and snapshots are saved at 40 Myr intervals down to redshift z = 4. Star formation in these simulations is reduced to a simplified recipe (invoked using the QUICK LYALPHA compile-time flag in P-GADGET-3 ): when gas particles exceed a density threshold of = 10 3 , with a temperature 10 5 K, they are removed from the hydrodynamic calculations and converted into star particles (Viel, Haehnelt & Springel 2004 ).As ATON-HE requires a uniform Cartesian grid, we project the gas density on to a grid of 2048 3 cells.

Radiati v e transfer with ATON
ATON is a radiative transfer code that reduces the dimensionality of the radiative transfer equation by taking angular moments of the cosmological radiative transfer equation (Aubert & Teyssier 2008, 2010 ).The equations are truncated at second-order using the M1 relation (Levermore 1984 ), creating a closed system of equations.This approach converts the radiative transfer equations into a set of equations for energy and flux.In this paper, we ran ATON with multiple frequency bins and incorporated helium.We refer to this revised version of ATON as ATON-HE , and provide a detailed description of this code in Appendix A .In an ATON-HE simulation, after establishing the initial gas and source distribution, sources are placed at the locations of dark matter haloes following a source model.The spectrum of each source is modelled using a black-body spectrum.The temperature of the black-body spectrum allocated to the ionizing sources, the breakdown of this spectrum into discrete frequency bins, and the associated total volume emissivity, are input parameters for ATON-HE .

Source distribution
In order to model the ionizing sources we assign ionizing emissivities to the haloes identified in the simulations.For this we extract two properties of the haloes from the simulations: the mass and the number of haloes in each mass bin.The mass function of the haloes and its evolution with redshift is illustrated in Fig. 1 , with colour representing the redshift.We see that the simulation struggles to resolve haloes with mass below ∼ 10 9 M h −1 .We therefore employ a cut-off of 10 9 M h −1 on halo mass and only use haloes more massive than this in our source modelling.The vertical line in Fig. 1 shows this halo mass cut-off.In our fiducial simulations, the first set of sources greater than the cut-off mass form at z = 19 .55, with a minimum and maximum halo mass of 10 9 M h −1 and 1 .8 × 10 9 M h −1 .As the simulation proceeds, by z = 6, the minimum mass remains the same while the maximum halo mass increases to 4 . 1 × 10 13 M h −1 .

Source spectrum
Fig. 2 illustrates the black-body source spectrum, its discretization, and the effect of the assumed black-body temperature used to characterize the ionizing spectrum of sources in our simulations.The figure depicts black-body spectra for two distinct temperatures 4 .0 × 10 4 and 7 .0 × 10 4 K, each discretized for computation into  energy bins, and highlighted by different shaded re gions.F or hydrogen-only simulations, using a single-frequency bin is often sufficient as e.g.discussed by Keating, Puchwein & Haehnelt ( 2018 ) and by Kulkarni et al. ( 2019a ).Ho we ver, for helium, which has three ionization states, a single frequency approach is not adequate.The spectrum is thus divided based on the ionization energies of hydrogen and helium, represented by the solid vertical curves at 13.6, 24.6, and 54.4 eV in Fig. 2 .We conducted a convergence test, as detailed further in Appendix D , and found that with four energy MNRAS 533, 2843-2866 (2024) bins the thermal evolution is sufficiently converged when modelling both hydrogen and helium (cf.Mirocha et al. 2012 ).
After discretizing the spectrum, the next task is to ascertain the average energy for the photons within each bin, given by where E i is the lower end of an energy bin, and I ν and E ν are the specific intensity and energy at frequency ν (Pawlik & Schaye 2011 ).
The dotted vertical lines in Fig. 2 represent this average energy for each energy bin.The average energy, as calculated by equation ( 1), is 17.4, 28.2, 40.5, and 58.3 eV for the black-body temperature of 4 .0 × 10 4 K, and is 18.4, 29.4, 42.5, and 61.8 eV for the black-body temperature of 7 .0 × 10 4 K.

Ionizing emissivity
An ionizing emissivity is assigned to each source halo.In our fiducial model, we follow Kulkarni et al. ( 2019a ) and Keating et al. ( 2020b ), and assume that the ionizing emissivity is proportional to the total halo mass.This is moti v ated by the fact that the star formation rate at these redshifts is expected to be approximately proportional to the halo mass.We set a total ionizing volume emissivity value, and distribute this emissivity o v er the haloes accordingly.We then calibrate the emissivity to obtain a range of evolutionary histories that are consistent with the observed mean Ly α transmission at 5 z 6 .2, as described in Section 3.1 .A full simulation run from z = 19 .3 to z = 4 .9 for a multifrequency simulation including helium requires approximately 900 GPU-hours, utilizing 32 NVIDIA A100 GPUs.

The simulation suite
With the method described abo v e, we created a suite of simulations, each varying in certain physical parameters, as detailed in Table 1 .All simulations were conducted with a box size of 160 cMpc h −1 .The first aspect we examined was the impact of the number of frequency bins, ranging from one to four.This led to simulations being categorized as either monofrequency or multifrequency.Additionally, we included simulations with and without helium to assess its influence on our results.
Another parameter of interest was the temperature of the blackbody spectrum of the ionizing sources.We tested two different temperatures for this purpose: 4 × 10 4 and 7 × 10 4 K.This allowed us to study the effects of varying source spectra on the simulations.
We further investigated the role of the minimum mass cut-off and the mass distribution for sources emitting ionizing photons.In this context, we focused on three distinct models: fiducial, 'Democratic', and 'Oligarchic' (Cain et al. 2023 ).The fiducial model assumes that the emissivity of the sources is proportional to the total halo mass, with the minimum mass set at 10 9 M h −1 .The Democratic model, in contrast, assigns the same ionizing emissivity to all sources, regardless of the halo mass, thereby enabling us to assess the effect of small mass haloes.The Oligarchic model selects a higher mass cut-off and maintains a distribution proportional to the total halo mass, thereby allowing us to examine the role of high mass galaxies during reionization.
To complete our series of cosmological simulations, we vary the redshift of the midpoint of reionization, making it larger than in our fiducial model in an 'Early' and 'Extremely Early' model.
Using our simulations, we can now investigate the impact of helium and a multifrequency treatment of radiative transfer.Our corresponding results are shown in Fig. 3 .By comparing with results from the Mono-H-40 and Mono-H-70 runs, the Multi-H-40 and Multi-H-70 runs allow us to isolate the effect of increasing the number of photon energy bins.Comparing with the Helium-40 and Helium-70 runs allows us to understand the role of helium.We focus on these six simulations in this section.

Calibration to the mean Ly α flux
We randomly extract 6400 distinct sight-lines from each simulation snapshot, at the same spatial resolution as that of the density grids used by ATON-HE .For these sight-lines, we calculate both the mean flux and the distribution of the ef fecti ve optical depth along these sight-lines.Ly α optical depths were computed using the approximation to the Voigt profile as described in Tepper-Garc ía ( 2006 ).
The mean flux in the simulated spectra is then calibrated to that from observed QSO absorption spectra at z > 5 .0 as measured by Bosman et al. ( 2022 ).The volume emissivity is modified manually until the mean flux agrees with the observations for redshifts 5 z 6 .2. In Fig. 3 (a), we show the evolution of the mean Ly α flux with redshift.
The observed mean flux is shown as black points.We succeeded to reach good agreement with the observed data for all of our models.As we show in Appendix C , a 5-10 per cent disagreement in the mean flux between different calibrated runs is permissible.
The ef fecti ve optical depth τ eff is calculated by the conventional definition, F = e −τ eff , measured in 50 cMpc h −1 segments of the Ly α forest.

Ionizing volume emissivity
In Fig. 3 (c), we show the redshift evolution of the volume emissivity of our models.A consistent feature across all models is the necessity of a drop in the emissivity at z < 7, which tends to level off as reionization approaches its conclusion, roughly at z = 5.Within the context of current models, this decline in emissivity remains challenging to explain (Cain et al. 2024 ), but see Ocvirk et al. ( 2021 ) and Qin et al. ( 2021 ) for discussions of this drop in the conte xt of radiativ e suppression of star formation and modelling with 21CMFAST , respectively.Note that in the six models that we discuss in this section we kept the evolution of the ionizing volume emissivity the same at z 9. Below z = 9, the multifrequency simulations fitting the observed Ly α forest data have a somewhat lower emissivity compared to their monofrequency counterparts.This lo wer emissi vity is due to the combined effects of less absorption near the host halo due to the smaller cross-section for harder photons and the somewhat higher temperature and thus lower recombination rates in the multifrequency simulations.
Comparing our fiducial 'Helium-40' and the 'Helium-70' simulation with the multifrequency, hydrogen-only simulations 'Multi-H-40', and 'Multi-H-70', it is evident that the simulations including helium require a larger emissivity.The reason is the absorption of photons by helium, which otherwise would have ionized hydrogen.For this reason, at redshifts when the ionizing volume emissivity is still the same (at z 9), the hydrogen neutral fraction is the largest in the simulations including helium.To match the Ly α forest data at lower redshift, a larger emissivity is therefore required, which is why the simulations including helium require the highest emissivity at z 9. Fig. 3 (d) shows the evolution of the ratio of the integrated number of ionizing photons per hydrogen atom as a function of redshift.Similar to the ionizing emissivity we find that the multifrequency simulations require the lowest number of photons per hydrogen atom ( ∼ 2 .3), while the simulations including helium require the largest ( ∼ 2 .5).

IGM temperature
Moving from the monofrequency, 'Mono-H-40' and 'Mono-H-70', to the corresponding multifrequency hydrogen-only simulations, 'Multi-H-40' and 'Multi-H-70', leads to a discernibly higher IGM temperature at mean density (Fig. 3 f).This higher temperature for the same assumed source SED is due to the inclusion of higher energy photons in the multifrequency simulations that inject more energy into the IGM.
Conversely, the temperatures in the simulations in the fiducial 'Helium-40' and in the 'Helium-70' simulation are slightly lower than in their hydrogen-only multifrequency counterparts.This is due to the larger ionization energy of helium that results in a smaller amount of energy injected into the IGM, leading to lower temperatures, especially noticeable at z > 9.As already discussed up to z ∼ 9 the models have the same ionizing volume emissivity.
The differences between the blue and green curves in the figure illustrate the effect of altering the source spectrum.Harder spectra with larger black-body temperature have a larger proportion of higher energy photons.The higher energy photons inject more energy into the IGM, explaining why the green curves consistently exceed the blue ones.
The peaks in the temperature-redshift curves thereby correlate closely with the reionization history .Notably , in the multifrequency hydrogen simulation 'Multi-H-70' with a black-body temperature of 7 × 10 4 K reionization finishes a little bit earlier than in the corresponding 'Helium-70' simulation, leading to the temperature curve peaking at slightly higher redshift.Note that an earlier conclusion of reionization leads to adiabatic cooling exceeding photoheating earlier.In contrast, with a source black-body temperature of 4 × 10 4 K, in the simulations with and without helium ('Multi-H-40' and the fiducial 'Helium-40', respectively) reionization completes at around the same time.Consequently, their temperature curves peak around the same redshift, but there is still a noticeable temperature difference as expected due to the presence or absence of helium.In essence, the exact timing of the peak in the temperature is governed by the conclusion of reionization, while the amplitude is influenced by the presence or absence of helium.
Fig. 3 (f) shows a variety of temperature measurements as collected in Gaikwad et al. ( 2020 ), which served as a reference for selecting the temperature of the black-body spectrum assumed in our simulations.We aimed to encompass the range of measured temperatures, leading to the choice of black-body temperatures of 4 × 10 4 and 7 × 10 4 K. Note, ho we ver, that the temperatures in our simulations are higher than the measurements by Garzilli et al. ( 2017 ) and Walther et al. ( 2019 ), suggesting that there is still some uncertainty in what should be assumed for the source spectrum.

Neutral fraction and mean free path evolution
In Fig. 3 (Umeda et al. 2024 ), Ly α emission equi v alent widths (Nakane et al. 2024 ), and the ef fecti ve Ly α opacity of the IGM (Yang et al. 2020b ;Ning et al. 2022 ;Gaikwad et al. 2023 ).Panel (c) shows the hydrogen-ionizing emissivity in the simulations.Panel (d) shows the ratio of the integrated number of hydrogen-ionizing photons to the number of hydrogen atoms.Panel (e) compares the mean free path of hydrogen-ionizing photons in the simulations with measurements by Worseck et al. ( 2014 ), Becker et al. ( 2021 ), Zhu et al. ( 2023 ), Gaikwad et al. ( 2023 ), andSatya v olu et al. ( 2023 ).Panel (f) plots the evolution of the IGM temperature at mean density in relation to measurements by Garzilli, Boyarsky & Ruchayskiy ( 2017 ), Walther et al. ( 2019 ) and Gaikwad et al. ( 2020 ).Panel (g) shows the cumulative Thomson scattering optical depth compared to measurements of Planck Collaboration VI ( 2020 ), de Belsunce et al. ( 2021 ), andGiar è et al. ( 2023 ) of the total Thomson scattering optical depth to the last scattering surface.et al. 2014 ;Zhu et al. 2022 ), the fraction of Lyman break galaxies showing Ly α emission (Mason et al. 2018(Mason et al. , 2019 ) ), the damping wing analysis of QSOs proximity zones (Greig et al. 2017 ;Ba ˜ nados et al. 2018 ;Davies et al. 2018 ;Greig et al. 2019 ;Wang et al. 2020 ;Yang et al. 2020a ; Ďuro v č íko v á et al. 2024 ), the analysis of Lyman-series opacities (Yang et al. 2020b ;Gaikwad et al. 2023 ), the evolution of Ly α equi v alent widths (Nakane et al. 2024 ), and the damping wing analysis from galaxies (Umeda et al. 2024 ).Our models are well within the error bars of these observations.Comparing our models we see that at z 9, the fraction of neutral hydrogen in the Downloaded from https://academic.oup.com/mnras/article/533/3/2843/7733916 by University of California, Santa Barbara user on 10 September 2024 MNRAS 533, 2843-2866 (2024) simulations including helium is consistently higher compared to that in the hydrogen-only simulations.As we assumed the same number of ionizing photons emitted in these models at high redshift this is not surprising and can be attributed to some higher energy photons, which were previously ionizing hydrogen now ionizing helium instead.Interestingly, the exact timing when reionization is completed varies between the different models despite all of them being calibrated to the observed mean Ly α flux at 5 z 6 . 2 to well within the observational errors.As a consequence of these small variations in timing, there can be a difference of two orders of magnitude or more in the average neutral fraction when the IGM transitions rapidly to being highly ionized at z = 5 .5 (compare e.g. the dashed green curve and the solid blue curves).We discuss these differences further in Section 3.5 .
We have also calculated the average mean free path of Lyman-limit photons for all our models using the method discussed in Kulkarni et al. ( 2016 ).We take the average Lyman-continuum transmission across random sight-lines in the comoving frame, and fit it with an exponential with an e-folding length scale of λ mfp , where λ mfp is the mean free path, and x is the position along a sight-line (Rybicki & Lightman 1986 ).The results are plotted in Fig. 3 (e).The differences between the simulations are moderate and can be traced back to the differences in neutral fraction and the differences in spectral distribution of the ionizing photons which lead to differences in absorption cross-section.The mean free path in the simulations agree well with those measured by Gaikwad et al. ( 2023 ), but are about a factor two higher than those measured in the near-zones of QSOs at 5 .8 < z < 6 by Becker et al. ( 2021 ), Satya v olu et al. ( 2023), and Zhu et al. ( 2023 ).At lower redshift the observed measurements converge, and also agree well with our simulations.
As expected the v olume-a veraged neutral fraction and the mean free path are anticorrelated.Looking more closely at models with similar reionization histories, e.g. the 'Mono-H-40' and the 'Helium-70' models, we see that the mean free path is shorter in the case of the simulation including helium.

Effecti v e optical depth distribution
As discussed abo v e, we hav e adjusted the ionizing volume emissivity so that the mean flux in the simulated spectra reproduces the mean observed Ly α flux.Ho we ver, the full flux distribution of the ef fecti ve optical depth contains more information.In Fig. 4 , we show the cumulative distribution function (CDF) of the ef fecti ve optical depth measured o v er 50 cMpc h −1 se gments for all ten distinct simulations spanning two redshift ranges: 5.36-5.51and 5.67-5.83.For each simulation, the shaded regions denote the 15.87-84.13percentiles of 10 000 realizations of the CDF, each created using the same number of LOS as the observed data, i.e. 65 and 48, respectively.The black curves show the observational data from Bosman et al. ( 2022 ).Notably, this newer data set offers tighter constraints on the ef fecti ve optical depth CDF than the measurements by Bosman et al. ( 2018 ) and Eilers, Davies & Hennawi ( 2018 ;see in Fig. 4 (l) for a comparison) with a less pronounced tail of high ef fecti ve optical depth in the 5 .67 < z < 5 .83 redshift bin, suggesting a somewhat smaller volume filling factor of neutral gas than the other two measurements at this redshift.Overall, the agreement between observed and simulated flux CDFs is good, but matching the mean flux does not al w ays translate into reproducing the width of the observed distribution, and in particular the tail at high optical depth.For both black-body temperatures the tail at high ef fecti ve optical depth in the redshift range 5 .67 z 5 .83 becomes more pronounced, as we mo v e from monofrequency to hydrogen-only multifrequency simulation and multifrequency simulation including helium.Note ho we ver that in this redshift range, all six models nevertheless show comparable average neutral fractions.
In the redshift range between 5.36 and 5.51 shown as the left set of curves in each panel, all of our models fall somewhat short at high ef fecti v e optical depths.The discrepancies observ ed among the models at this redshift can be attributed to variations in their average neutral fractions.The 'Multi-H-70' simulation, in particular, shows a significantly lower neutral fraction, which is reflected in its CDF extending to lower effective optical depth.
When examining the CDF of the ef fecti ve optical depth, it is apparent that even if the mean flux values agree well across simulations, significant variations are present in their effective optical depth distributions.These variations are most noticeable in the tail end of the distribution curves.Due to the exponential relationship between mean flux and ef fecti ve optical depth, higher values of ef fecti ve optical depth contribute less significantly to the mean flux.This permits a wide range of variation in the high ef fecti ve optical depth tail without markedly affecting the mean flux.Another intriguing observation is how small the differences in mean flux and ef fecti ve optical depth distribution are, even when there is a substantial difference (up to two orders in magnitude or more) in the average neutral fraction between 5 z 5 .5. In this redshift range the average neutral fraction is typically quite low.In such cases, a limited number of neutral cells can significantly alter the average neutral fraction.Hence, a relatively small proportion of neutral cells strongly influences the o v erall av erage neutral fraction.Conv ersely, the mean flux along any given line of sight (LOS) is most significantly affected by the regions with the lowest density.Therefore, despite substantial variations in the average neutral fraction along a LOS, the mean flux may remain relatively constant, provided the least dense regions are similar.To summarize, the average neutral fraction is predominantly set by cells with the highest neutral fractions and hydrogen number densities.On the other hand, the flux in most of the volume is influenced more by the least dense cells.This dichotomy leads to the differences in the effective optical depth distribution and the volume averaged neutral fraction in simulations with similar mean flux.
Note that when calibrating simulations to the same mean flux, we also encounter a notable variation in the distribution of the effective optical depth CDF even for similar median ef fecti ve optical depth.This is possible because once the ef fecti ve optical depth in a 50 cMpc h −1 segment of the spectrum is large the exact value has little effect on the median ef fecti ve optical depth and the mean flux as the whole segment of spectrum consists mainly of saturated absorption.

VA RY I N G T H E S O U R C E M O D E L
In this section, we examine three distinct multifrequency helium models with a spectral black-body temperature of 4 × 10 4 K: fiducial, and following Cain et al. ( 2023 ), a 'Democratic' and a 'Oligarchic' source model.Similarly to Fig. 3 , the properties of simulations with these different source models are compared to our fiducial model in Fig. 5 by the solid blue, dashed green and dashed purple curves.The fiducial 'Helium-40' model assigns an ionizing photon emissivity to haloes in proportion to their mass ( Ṅ ion ∝ M).In the Democratic model, the total ionizing emissivity is evenly distributed across all haloes, regardless of their mass.This model allows us to assess the Downloaded from https://academic.oup.com/mnras/article/533/3/2843/7733916 by University of California, Santa Barbara user on 10 September 2024 impact of smaller haloes in the reionization process.The Oligarchic model, on the other hand, biases the emissivity towards larger halo masses by assuming a higher mass cut-off for ionizing sources at 8 .5 × 10 9 M h −1 instead of 10 9 M h −1 , thereby focusing on the influence of more massive galaxies in driving reionization.The emissivity distribution is similar to the fiducial simulations, i.e.Ṅ ion ∝ M.
Comparing the fiducial 'Helium-40' model with the Oligarchic model we see that their emissivity (Fig. 5 c) is similar at higher redshifts ( z 7), while the Oligarchic model requires slightly less emissivity at lower redshifts to fit the Ly α forest data.Despite similar emissivities at z 7, the Oligarchic model exhibits a roughly 10 per cent lower neutral hydrogen fraction (panel b) around z ∼ 7.This difference arises because the Oligarchic model with its higher mass cut-off assigns larger mass haloes an increased emissivity.The resultant earlier ionization also allows for more adiabatic cooling, slightly lowering the IGM temperature (panel f) at later times in the Oligarchic model compared to the fiducial model.The Oligarchic model also predicts a larger mean free path (panel e), suggesting a smaller number of remaining small neutral islands within the e xtensiv e ionized re gions.Fig. 4 (k) shows that the Oligarchic model is very similar to the fiducial model in terms of the distribution of ef fecti ve depths across the observed redshift range.
Comparing the Democratic model with the fiducial 'Helium-40' model, we see that the emissivity (panel c) is again similar at z 7 by construction, while the Democratic model requires higher emissivity at lower redshifts to agree with the Ly α forest data.The reionization histories (panel b) of both models agree closely at z 7, but the increased emissi vity belo w this redshift causes reionization to happen slightly faster in the Democratic model, resulting in a reduced neutral fraction.The slightly faster ionization in the Democratic model also manifests as a slightly earlier peak in the temperature profile (panel f) compared to the fiducial model.The mean free paths (panel e) for both models are almost identical, probably reflecting a similar number of remaining neutral islands within ionized zones as in the fiducial model.
Figure 5.The reionization histories and various other related quantities as in Fig. 3 for the Democratic, Oligarchic, 'Early' and 'Extremely Early' reionization models.The observational points are the same as in Fig. 3 .
model is similar to the fiducial model in both redshift ranges plotted.Fig. 6 presents a two-dimensional slice depicting the neutral fraction for the three different source models at z = 6 .15, a point where the v olume-a veraged neutral fractions of the models are similar: x H I v = 0.734, 0.798, 0.788 for fiducial, Democratic, and Oligarchic models, respecti vely.Indi vidually identifiable ionized regions in the Oligarchic model are notably larger, and have a lower neutral fraction, a reflection of the higher emissivity allocated to massive haloes.Conversely, the Democratic model exhibits more numerous smaller ionized bubbles.The differences in bubble morphology reflects the more spatially dispersed distribution of lower mass haloes and the more highly biased clustering of the higher mass haloes.The fiducial model falls between the other two source models, with the number of small ionized regions exceeding that in the Oligarchic model, but being lower than in the Democratic model.
In conclusion, we find that calibration to the mean flux is possible for very different source populations as long as the ionizing emissivity is properly chosen.In the Oligarchic model the required ionizing emissivity was thereby somewhat lower and in the Democratic model somewhat higher than for our fiducial source model.

T H E E A R LY S TAG E S O F R E I O N I Z AT I O N
While a late end of reionization is now well established thanks to the Ly α forest data, the earlier evolution of the neutral fraction is Downloaded from https://academic.oup.com/mnras/article/533/3/2843/7733916 by University of California, Santa Barbara user on 10 September 2024 ).The effect of the damping wings of the remaining neutral hydrogen IGM, either directly observed as damping wings or due to the resulting suppression of the Ly α emission from reionization-epoch galaxies and AGN, offers the exciting prospect to significantly impro v e these constraints as the number of JWST spectra increases and the modelling impro v es (Keating et al. 2024 ;Umeda et al. 2024 ).To aid these efforts, we consider, in addition to our fiducial reionization history with a mid-point of reionization at z = 6 .5, two other models with mid-points at z = 7 .5 and z = 9 .5, respectively.These models are also shown in Fig. 5 and are the fiducial (dark blue curve), 'Early' (red curve), and 'Extremely Early' (dark yellow curve), respectively.The properties of all models can be found in Table 1 .In our analysis, we utilized simulations all featuring a black-body source spectrum with the same spectral temperature of 4 × 10 4 K as in our fiducial model.The 'Early' model was run like the fiducial model as a multifrequency simulation including helium, whereas the 'Extremely Early' model was run as a monofrequency hydrogenonly simulation.As outlined in Section 3 , the inclusion of helium introduces some moderate variations in the observables, but for the purpose of examining the impact of significantly earlier reionization timing that we are pursuing here, the monofrequency hydrogen simulation we performed should be sufficient.We thus decided not to invest the significant resources required to rerun and re-calibrate this model including helium.
The various properties of the 'Early' and 'Extremely Early' model are also shown in Fig. 5 .In panel (a) the mean Ly α flux for the three reionization histories shows very small differences.Turning our attention to the emissi vity, as sho wn in panel (c) of Fig. 5 , we observe that the emissivity peaks at z ∼ 6 .5 for the fiducial, z ∼ 7 .25 for the 'Early', and z ∼ 10 for the 'Extremely Early' model.The earlier peaks in the 'Early' and 'Extremely Early' simulation were required to start reionization earlier.The larger ionizing emissivity is reflected in a lower neutral fraction at high redshifts [panel (b) of Fig. 5 ].In the 'Extremely Early' model by z ∼ 13, the volume averaged neutral fraction is ∼ 0 .8, while the fiducial model still has a value of 0.99.As reionization proceeds, the neutral fraction of the three models start to decrease at different rates and by z ∼ 6 it is 0.04 for the 'Extremely Early' model, while its 0.16 for the fiducial model.The differences persist down to z = 5 .2, after which all three models have the same volume averaged neutral fraction.Comparing to the observational points, we see that the 'Extremely Early' model, at z > 8, might be more ionized than some of the data suggests.The effect of the difference in reionization timing is also seen for the temperature at mean density, shown in panel (f).As the 'Extremely Early' simulation reionizes earlier, it adiabatically cools for a longer duration leading to a lower value of the peak of the temperature at z ∼ 6.The temperature peak of the 'Early' model is somewhat higher, while the temperature in the fiducial model is the highest, as it ionizes the fastest leaving less time for adiabatic cooling.When analysing the mean free path, it becomes clear that later reionization results in a somewhat shorter mean free path.Through the construction of these three models, we have ef fecti vely demonstrated that fitting to the Ly α mean flux still leaves room for a large variation in the reionization history.A greater number of models than perhaps initially anticipated are consistent with the mean Ly α flux data.Note, ho we ver, that the Thomson optical depth for the scattering of CMB photons for the 'Extremely Early' model is > 3 σ larger than reported by Planck Collaboration VI ( 2020 ), and is also significantly larger than those reported by de Belsunce et al. ( 2021), but is consistent with those recently reported by Giar è et al. ( 2023 , see panel g).
In Fig. 7 , we provide a visual representation of the fiducial 'Helium-40', and the 'Early' and 'Extremely Early' models, showing lightcones that chart the fraction of neutral hydrogen, starting at a redshift of 19 and e xtending be yond the completion of hydrogen reionization to redshift 5, as shown on the bottom x -axis.The sources in our models start producing ionizing photons as early as z = 19, and thus we start our lightcones from that particular redshift.Even though our simulation only encompasses a spatial dimension of 160 cMpc h −1 , we have made use of its periodic boundary conditions to produce lightcones that exceed the dimensions of the simulation box, as shown on the y -axis.By orienting the lightcone at an angle through the simulation, we ensured v aried vie wing of the box but because of the rather large coherence length of ionized/neutral regions some periodicity remains.On the top axis, we show the corresponding age of the universe.Redder (bluer) colours in the lightcones correspond to more neutral (ionized) regions.
In the 'Extremely Early' model, the already substantial imposed emissivity at the highest redshifts combined with the scarcity of sources leads to some probably unrealistically luminous sources concentrated in a small number of regions that give rise to a few, Downloaded from https://academic.oup.com/mnras/article/533/3/2843/7733916 by University of California, Santa Barbara user on 10 September 2024 but sizable, ionization bubbles at these redshifts.In the rightmost section of Fig. 7 , these manifest themselves as blue regions around z = 17 .3 in the 'Extremely Early' model (panel c), which slowly recombine as the simulation proceeds and our imposed emissivity is distributed o v er a larger number of ionizing sources.A closer examination of the bottom lightcone shows large ionized bubbles appearing as early as z ∼ 14.Ho we ver, these bubbles are still transient, primarily because of the ele v ated recombination rates characteristic of these redshifts, coupled with the redistribution of ionizing emissivity as larger numbers of sources become operational.This is a numerical artefact created due to our method of assigning ionizing emissivity to sources.The model settles quickly into a more realistic source population from z ∼ 12 onwards, with larger number of ionizing sources sharing the imposed emissi vity.Adv ancing through the simulation timeline, e xpansiv e ionized hydrogen zones take shape and merge, eventually resulting in a fully reionized universe by z ∼ 6.This progression is reflected in the evolution of the v olume-a veraged neutral fraction shown in the bottom left of Fig. 5 .In contrast, the fiducial model shown in the top lightcone of Fig. 7 is dominated by much smaller ionized regions (spanning only a few cMpc h −1 ) that persist up to z ∼ 7 .3. Subsequently, there is a swift escalation in the expansion of these bubbles, which then quickly fill the entirety of the simulated volume.Notably, the fiducial reionization model retains extended pockets of dense, neutral hydrogen, enduring until z ∼ 5 .5. Positioned between these two extremes, the 'Early' model presents a more gradual growth of the ionized regions.The first ionized bubbles appear around z ∼ 9, with significant neutral regions enduring until about z ∼ 6 .4. This disparity in neutral islands manifests as a notable difference in the CDF at high optical depth, as seen in the bottom four panels in Fig. 8 .Here, we show the effective optical depth on the x -axis, and the CDF of the ef fecti ve optical depth on the y -axis for eight redshift bins.The redshift range of each bin is shown at the bottom right and is the same as the redshift range of the observational points for the mean Ly α flux shown in Fig. 5 (a).Mirroring our analysis in Section 3.5 , we hav e e xtracted a sample of LOS from our snapshots that matched the observed number of sight-line for the respective redshift bins.This process was repeated ten thousand times, enabling us to estimate the 1 σ (68 .26 percent ) range for the ef fecti ve optical depth CDF for the different redshift bins.
The observed distribution is shown by the black curve, while the fiducial, 'Early', and 'Extremely Early' models are shown by the dark blue, red, and yello w curves, respecti vely.Starting with the highest redshift bin, the models agree reasonably well with the observed distribution.The ef fecti ve optical depth in ionized regions (low τ ) is in slightly better agreement with the data in the fiducial model compared to the other two.The ef fecti ve optical depth in the neutral regions (high τ ), on the other hand, is fit better by the 'Early' and 'Extremely Early' models.The fiducial model, with its higher v olume-a veraged neutral fraction, possesses regions that are more neutral compared to the other models.Progressing to the redshift bin of 5 .83 < z < 6, the agreement at the tail end of the ef fecti ve optical depth distribution in the fiducial model is also not perfect, suggesting slightly more neutral gas than the observed distribution.The 'Early', and 'Extremely Early models, in contrast, agree here better with the data except at the largest optical depth, where they fall short in predicting the full range of the high optical depth tail.Note, ho we ver, that the simulations snapshots contributing to the highest redshift bins are at z = 6 .14 and z = 5 .94, towards the upper end of redshifts probed by the observations in these two redshift bins.In the subsequent redshift bins, where outputs from our models are more central in the redshift bin, the fiducial model reproduces the observed distribution reasonably well.On the other hand in the 'Extremely Early' model the distribution falls somewhat short at high ef fecti ve optical depth, compared to the data.In the simulations large ef fecti ve optical depths in this redshift range correspond to the last remaining neutral islands and the model appears to produce too few of these in the redshift range 5 .5 < z < 5 .83 to be consistent with the data.The ef fecti ve optical depth distribution in the 'Early' model lies in between the other two cases.Below z ∼ 5 .4, the v olume-a veraged neutral fraction in the 'Extremely Early' model is significantly lower compared to the fiducial and the 'Early' model despite the ef fecti ve optical depth CDFs in all three models being similar.This is due to a small difference in the redshift when the last neutral island disappear and reionization fully completes.Note also, that the neutral hydrogen fraction of the 'Extremely Early' model in this redshift range is also significantly lower than the possible range of values found by Gaikwad et al. ( 2023 ).As the 'Extremely Early' model fits the ef fecti ve optical depth distribution, this would suggest that the error bars in Gaikwad et al. ( 2023 ) are somewhat underestimated in the redshift range where the last neutral regions are ionized and the v olume-a veraged neutral fraction changes rapidly.Scrutinizing the 'Extremely Early' and 'Multi-H-70' models more closely reveals another subtle aspect of reionization history impact on ef fecti ve optical depth distributions.Despite these models exhibiting almost identical v olume-a veraged neutral fractions in the redshift range 5 .6 z 6, as shown in Fig. 5 (b), their CDF of ef fecti ve optical depth differ within the range 5 .67 < z < 5 .83, as shown in Fig. 4 .
A comparison of our results with the data by Bosman et al. ( 2022) reveals the quality of these newer data compared to previous measurements.The narrower distribution, particularly at higher optical depths, makes it more difficult for the simulations to align with the full CDF.Past work presented by Keating et al. ( 2020a ) and Kulkarni et al. ( 2019a ), managed to easily achieve a fit to the observed CDF within the errors of previous measurements by calibrating to the mean flux.We found this approach more challenging when using the significantly tighter constraints of the new measurements by Bosman et al. ( 2022 ).
The disparities in the distribution of neutral and ionized regions across the models naturally lead us to look into the spatial configuration and characteristics of the ionized bubbles within these simulations.Given the recent data from the JWST that sheds new light on the relation of ionized bubbles and galaxies, it becomes pertinent to examine the spatial distribution of bubbles and ionizing sources in our simulations with different reionization histories.We will do this in the following section.

M O R P H O L O G Y A N D S I Z E S O F I O N I Z E D B U B B L E S
In the previous section, we saw that a wide range of reionization histories are consistent with the observed Ly α opacity distribution at 5 z 6 .2. As expected and as is visually apparent the different reionization histories differ strongly when the first ionized regions appear and grow into large ionized bubbles.As already discussed abo v e, the transmission of Ly α emission from reionization-epoch galaxies depends strongly on whether they are affected by the damping wings of intervening neutral gas along the line of sight to the observer (Umeda et al. 2024 ).Modelling this is complex as the Ly α transmission depends on the intrinsic systemic redshift and width of the Ly α emission line.It also depends on the peculiar velocity between galaxy and surrounding neutral gas as well as the distance of the galaxy to the bubble 'wall'.Furthermore, the neutral hydrogen column density of the bubble wall and the absorption by residual neutral hydrogen within the ionized bubble (Weinberger et al. 2018 ;Keating et al. 2024 ) also plays a role.To bypass the many uncertainties in this modelling, the size of the ionized bubble in which the reionization-epoch galaxy is located in is often taken as a proxy for high Ly α transmission.Canonically a radius of one proper Mpc is assumed to ensure high Ly α transmission (Weinberger et al. 2018 ).In this section, we have therefore quantified the bubble growth in our simulation and calculated the evolution of the fraction N h , frac of our ionizing sources as characterized by their total host halo mass that sit in ionized bubbles of a given size.We will see that N h , frac rises to unity significantly earlier than the volume averaged neutral hydrogen fraction x HI v due to the strongly biased clustering of haloes sufficiently massive to host ionizing sources at z > 6.

Calculating bubble sizes
In Fig. 9 , we demonstrate how we measure bubble sizes within our simulation, leveraging image processing techniques.The left-hand plot of the figure provides a visual representation of the neutral hydrogen distribution within a two-dimensional cross-section of our 10 cMpc h −1 simulation box.A smaller simulation box was selected to allow visual verification of the bubble count.Here, ionized regions are shown in blue, while more neutral areas are presented in red.For the bubble identification in our simulations, we first convert the colour image depicting neutral hydrogen distribution into a binary black and white version, as illustrated in the middle panel of the figure.This binary representation requires defining a threshold to distinguish between ionized and neutral regions.We set a threshold value of 0.5 for the neutral hydrogen fraction (Chardin, Aubert & Ocvirk 2012a, 2014 ), abo v e which a cell is categorized as neutral, while those below are treated as ionized.During this conversion ionized regions become white, while neutral regions are transformed into black.The next step involves applying a connected component labelling algorithm (CCL) to count and identify each ionized (white) region as a distinct bubble.This process identifies contiguous regions within a binary image where pixel values match.
We then use the python routine scipy.measure.label, with a setting connectivity = 3 that ensures that a voxel connects to its neighbours if they share any common points, to implement the CCL algorithm.Ho we ver, gi ven the periodic nature of our simulation volume, the standard algorithm o v erlooks bubbles that extend across the boundaries of the simulation volume.To rectify this we exploit the periodicity of our simulations and link such regions across the simulation boundaries.In the right-hand panel, we show the implementation of the modified CCL algorithm where each bubble has a unique colour, and number to indicate its count.The methodology applied in our simulation successfully identifies seven distinct bubbles.A particularly noteworthy example is bubble 1, highlighted in bright red, located at the left edge of the plot.This bubble shows the effect of the periodic boundary conditions, extending across both the left-to-right and top-to-bottom edges of the simulation box.This periodicity is accurately captured by our technique, underscoring its ef fecti veness in identifying and quantifying the dimensions of ionized bubbles within the simulation.This approach can be straightforwardly extended to a three-dimensional analysis.Once the bubbles are identified, we can quantify the number of cells within each bubble, which corresponds to its volume V .Assuming a spherical shape for the bubbles, we can then infer their 'radius' R using the relation V = 4 πR 3 / 3.For a simulation cube measuring 160 cMpc h −1 , this method yields a maximum radius of 99 .2 cMpc h −1 .Although the assumption of a spherical bubble is not entirely accurate, it does provide a reasonable estimate of the radius of the bubble radius for our purposes.Other bubble measuring techniques are the 'Distance transform' (Zahn et al. 2007(Zahn et al. , 2011 ) ) and 'mean free path' method (Mesinger & Furlanetto 2007 ), which assign a 'radius' to each cell in the simulation.Other methods to identify ionized bubbles employ the friends-of-friends algorithm (Iliev et al. 2006 ;Chardin, Aubert & Ocvirk 2012b, 2014 ), while others employ the watershed algorithm which also uses the CCL algorithm (Soille 2003 ;Lin et al. 2016 ).

Bubble sizes and the timing of reionization
The lightcones depicted in Fig. 7  shows the fraction of volume in ionized bubbles ( x HI < 0 .5) with radius greater than 0.5 pMpc (dotted curves), 1 pMpc (solid curves), and 3 pMpc (dashed curves) in our fiducial, 'Early', and 'Extremely Early' reionization models.For these three bubble sizes, Panel (b) shows the fraction of haloes with mass greater than 10 10 M h −1 that are located in the ionized regions in these three models.Panel (c) focuses on our fiducial model, and a minimum bubble radius R min of 1 pMpc, for which this panel shows the number of haloes with mass greater than 10 9 , 10 10 , 10 11 , and 10 12 M h −1 that are located in the ionized regions.
apply this analysis to the entire three-dimensional simulation box.
The outcomes are presented in Fig. 10 .The left plot in Fig. 10 presents a comparative analysis of the evolution of the fraction of the volume ( V frac ) occupied by ionized bubbles exceeding a certain minimum radius with redshift for different simulation models.The plot categorizes bubbles based on their size: dotted curves represent bubbles with a minimum radius of 0.5 pMpc, solid curves are for 1 pMpc, and dashed curves are for 3 pMpc.Each line colour corresponds to a different simulation model.At higher redshifts, a larger proportion of the simulation volume is occupied by smaller bubbles with radius 0.5 pMpc.Ho we ver, as the uni verse e v olves, these smaller b ubbles quickly merge or expand, leading to a convergence in the volume occupied by bubbles of different sizes.The timing of reionization onset in each model plays a crucial role in determining the bubble size evolution.For instance, the 'Extremely Early' model, in which reionization starts earliest, exhibits a remarkable 30 per cent of its total volume occupied by ionized bubbles with size 1 pMpc as early as z = 11 .5. In contrast, the 'Early' and other models show negligible volume fractions occupied by ionized regions with size 1 pMpc at this redshift.By z = 10, the 'Early' model shows a substantial increase, with around 10 per cent of its volume containing ionized bubbles 1 pMpc.Overall, these trends closely mirror the evolution of the volumeaveraged ionization fraction for each simulation.
In the middle plot of Fig. 10 , we examine the fraction N h , frac of haloes with a mass exceeding 10 10 M h −1 that are located within ionized regions of a radius greater than a specified minimum, R min .The colour coding and line styles in this plot are consistent with those in the left plot.When considering the redshift evolution of sources and bubbles we should mention that we post-process simulation snapshots taken at 40 Myr intervals which requires some attention.Consider at redshift z 1 sources emit photons, leading to the formation of ionized bubbles.By redshift z 2 , the source information is updated based on data from the hydrodynamical simulations, changing the source locations.When identifying sources within bubbles we take the bubble sizes at redshift z 2 and the source position at redshift z 1 , i.e one snapshot prior to the redshift at which the bubble sizes are calculated.Looking at the plots we can see a rapid convergence of the curves for all bubble sizes considered, mirroring the trends observed in the volume fraction analysis.Furthermore, the fraction of haloes found within ionized bubbles of size 1 pMpc escalates swiftly towards unity, indicating that shortly after the onset of reionization, the majority of these haloes are located within ionized re gions.F ocusing on the fiducial 'Helium-40' model as an example, we observe that by around z ∼ 7, all the haloes in this category are situated within ionized bubbles greater than 1 pMpc in radius.Yet, at this same redshift, these large bubbles constitute only about 40 per cent of the total volume of the simulation.This pattern is consistent across all models, underscoring a strong correlation between the locations of the haloes and the ionized regions.This correlation suggests that the formation and expansion of ionized bubbles during the reionization process are closely linked to the presence of these massive haloes.As these haloes are likely to be significant sources of ionizing radiation, their spatial distribution plays a crucial role in shaping the reionization morphology, leading to a scenario where these haloes are often found within at the centre of the ionized regions.
Fig. 10 (c) shows the distribution of haloes of varying masses within ionized regions of radius > 1 pMpc during the reionization process for the fiducial model.This plot is particularly insightful for understanding how halo mass correlates with their likelihood of being within large ionized bubbles.There is a clear trend indicating that haloes with greater mass are more rapidly encompassed by ionized re gions.This is e x emplified by the curv es representing different mass cut-offs, where the heavier the halo mass (ranging from 10 9 to 10 12 M h −1 ), the sooner it is found within an ionized bubble.Note that for the halo mass of 10 12 M h −1 the curve starts at z = 8 .56 as haloes of this mass are not present in the simulation abo v e this redshift.Furthermore, we see that as reionization nears its end, even lower mass haloes (as indicated by the blue curve for haloes with mass around 10 9 M h −1 ) are eventually found within large ionized regions.This inclusion reaches completion around z ∼ 5 .5, at which point the ionized bubbles occupy the entirety of the simulation volume, as depicted in the volume fraction shown in the left plot.
Our analysis indicates that while ionized bubbles of various sizes are present at the onset of reionization, they quickly merge with one another.The timing of the reionization onset plays a crucial role in determining when these bubbles will encompass the entire simulation volume.We also find that more massive haloes have a high probability to be found within large ionized regions shortly Downloaded from https://academic.oup.com/mnras/article/533/3/2843/7733916 by University of California, Santa Barbara user on 10 September 2024 after the start of reionization.This suggests that these haloes, due to their higher ionizing output, are mainly responsible for shaping the early structure of ionized regions.By the end of reionization, haloes across the mass spectrum, including the low-mass ones, are integrated into the ionized volume.Our fiducial model seems to not hav e an y number of high mass haloes in ionized bubbles with radius > 0 .5 Mpc at z = 10, which may be in contrast with recent JWST results claiming detections of bright Ly α emitters already at z ∼ 10 and beyond.We will discuss the implications for the optical depth for Thomson scattering of the CMB in the next section.

D I S C U S S I O N
In our simulations, we explored the effect of a range of parameters including the number of frequency bins in the radiative transfer computation, the source SED, helium chemistry, early stages of reionization, and the source models on several observables related to the EoR.Some of these parameters, in varying combinations, have also been subjects of investigation in previous studies.It is thus instructive to compare our findings with the literature.
Our work builds upon the foundations laid by Kulkarni et al. ( 2019a ) and Keating et al. ( 2020b ), who utilized the ATON code for monofrequency hydrogen simulations, calibrating them to the Ly α mean flux data from Bosman et al. ( 2018 ).Notably, Keating et al. ( 2020b ) explored scenarios with ele v ated Thomson optical depth τ CMB values.We impro v e on these simulations by calibrating them to the more recent and stringent Ly α flux constraints provided by Bosman et al. ( 2022 ), which place tighter bounds on the distribution of regions with large Ly α optical depth.Our models with higher τ CMB values, labelled 'Early' and 'Extremely Early', probe the morphological characteristics of H II regions at higher redshifts than the previous studies based on ATON simulations.We have also explored different source populations namely the 'Democratic' and 'Oligarchic' models.
Our findings with regard to the effect of including helium echo those of Ciardi et al. ( 2012 ), especially regarding the observation that the inclusion of helium can postpone the reionization process and lead to diminished temperatures in simulations including helium relative to those considering only hydrogen, particularly at redshifts z 10.A significant advantage of our approach is the considerable scale of our simulations, which utilize a box size nearly fivefold larger than that employed by Ciardi et al. ( 2012 ) and our simulations are calibrated to observed Ly α forest data.The THESAN simulation suite, presented by Kannan et al. ( 2021 ), adopts a coupled radiationhydrodynamics approach including helium with multiple frequency bins.Ho we ver, their fiducial run, as analysed by Garaldi et al. ( 2022 ), does not agree as closely with the E-XQR-30 data from Bosman et al. ( 2022 ) as our simulations do.Additionally, the THESAN suite exhibits a monotonically increasing emissivity, compared to a dip required by our models, a characteristic that Cain et al. ( 2024 ) attributes to the implementation of a reduced speed of light (see also Qin et al. 2021 ).Note here that in our models the dip is less pronounced in the Oligarchic, 'Early' and 'Extremely Early' models than in our fiducial model.Cain et al. ( 2024 ) scrutinized the influence of 'Oligarchic' versus 'Democratic' source models with ray-tracing simulations, finding the 'Oligarchic' model less fa v oured due to its higher neutral fraction.Conversely, Garaldi et al. ( 2022 ) introduced the ' THESAN-LOW-2 ' model, where sources abo v e 10 10 M contribute to ionizing radiation, but noted that such a model has a consistently lower neutral fraction than their fiducial models.Our analysis, ho we ver, re veals that adjusting the ionizing volume emissivity permits the calibration of an 'Oligarchic' model that closely tracks the 'Democratic' model in terms of its ionization history but also fits the Ly α ef fecti ve optical depth CDF.Neyer et al. ( 2024 ) and Lu et al. ( 2024 ) recently used the THESAN simulations and the 21CMFAST framework (Mesinger, Furlanetto & Cen 2011 ), respectively, to investigate the relation of ionized regions and ionizing sources.Similar to previous studies they found that galaxy o v er-densities are more likely to be found in ionized regions, and that brighter galaxies are surrounded by larger ionized bubbles.We have followed on from this here and have investigated the spatial distribution of ionizing sources within the ionized regions in our models consistent with observations of the Ly α forest.Park et al. ( 2021 ) looked at the Ly α transmission from galaxies in the coupled radiative hydrodynamical simulation CoDa II (Ocvirk et al. 2020 ) and found that bright galaxies are less affected by reionization.This has also been found in se veral observ ational studies.Endsley & Stark ( 2022 ) found that nine out of ten galaxies show Ly α emission at z = 6 .8. Harish et al. ( 2022 ) found 15 Ly α emitters (LAEs) out of a sample of 21.These observation agree with predictions from our reionization models, as illustrated in Figs 10 (b) and (c), showing that, across various timing models and halo masses, more than 85 per cent of sources are situated within ionized bubbles larger than 1 pMpc at z = 7.This bubble size is large enough for the Ly α emission to sufficiently redshift before it encounters the bubble wall, allowing into to escape without resonant scattering (Weinberger et al. 2018 ).Moreo v er, the study by Endsle y et al. ( 2021) indicated a relatively constant transmission rate of Ly α emission from massive galaxies between z ∼ 6 and 7, contrasting with the significant decline observed in less massive galaxies, as reported by Fuller et al. ( 2020 ).Our models mirror this behaviour, as evidenced in Fig. 10 (c), where we see that larger haloes are incorporated into ionized zones much earlier than their smaller counterparts.Consequently, between z ∼ 6 and 7, there is an observed increase of about 20 per cent in the proportion of low-mass (10 9 M h −1 ) sources exhibiting Ly α emission, in stark contrast to a marginal change for haloes with masses exceeding 10 10 M h −1 .
The detection of Ly α emission at high redshifts, where the IGM becomes increasingly neutral, provides important insights into the early stages of reionization.Pentericci et al. ( 2018 ), andHarish et al. ( 2022 ) have identified galaxies within the redshift range of approximately 6-7, while studies by Oesch et al. ( 2015 ), Stark et al. ( 2016 ), Jung et al. ( 2019 ), Tilvi et al. ( 2020 ) and Jung et al. ( 2020 ) have extended these findings to galaxies around z ∼ 7 .5. Furthermore, Zitrin et al. ( 2015 ) reported a Ly α emitter at z = 8 .68, and Bunker et al. ( 2023 ) disco v ered an LAE at an even more distant redshift of z = 10 .60.In the context of our simulations, particularly the fiducial model, we observe that up to z ∼ 8, haloes with masses e xceeding 10 9 M h −1 hav e a 50 per cent chance of residing within ionized bubbles large enough to facilitate the escape of Ly α emissions without encountering a large optical depth to resonant scattering.This probability diminishes sharply past z ∼ 9, challenging the viability of our fiducial reionization history in light of the recent JWST observations of LAEs at z 9. To address this, our 'Early' and 'Extremely Early' have reionization histories where ionized bubbles of sufficient size exist at z ∼ 10 .50, to allow for a high transmissivity of Ly α emission.Assessing this more quantitatively will need larger observed samples and more detailed modelling of the Ly α emission and will depend in particular on the redshift of the Ly α emission before it is scattered by the circumgalactic medium (CGM) and the IGM (see Sadoun, a recent summary of the observed properties of Ly α emitters at z ∼ 5 -6 and a discussion of implications for Ly α transmissivity at earlier redshifts).The modelling will be further complicated by an possible increased contribution from AGN to the Ly α emission (Maiolino et al. 2023 ;Scholtz et al. 2023 ).
In summary, the expected strong biasing of ionizing sources and the implied tendency to reside early on in rather large ionized bubbles helps to explain the presence of bright LAEs at redshifts where the Universe is still predominantly neutral.Ho we ver, we also note again that the 'Extremely Early' model exhibits a τ CMB which is in > 3 σ in tension with measurements reported by Planck Collaboration VI ( 2020 ).

C O N C L U S I O N S
In this paper, we have studied a range of reionization histories consistent with the observed mean Ly α transmission and the Ly α opacity distributions in the E-XQR-30 sample at z > 5. We perform cosmological simulations post-processed for radiative transfer using ATON-HE , an updated version of the ATON code that includes helium.
We calibrate multiple radiative transfer simulations of reionization to the mean Ly α transmission F reported in the new measurements of the Ly α opacity by Bosman et al. ( 2022 ).The new Ly α opacity measurements are consistent with previous studies, but have significant lower uncertainties with CDFs that show a somewhat less pronounced tail at high optical depth.We notice that the values of the neutral hydrogen fraction that set the mean flux F are different from the values that dominate the v olume-a veraged neutral hydrogen fraction, x HI v .Consequently, models calibrated to the F measurements reported by Bosman et al. ( 2022 ) can still have some difference in x HI v at these redshifts.The less pronounced tail at high optical depth thereby appears to require a somewhat smaller volume filling factor of remaining neutral islands than was suggested by previous measurements of the Ly α opacity.
Relative to our previous monofrequency simulations, multifrequency simulations calibrated to the mean Ly α transmission F show expected changes.The gas temperature at mean density in these simulations is higher than in their monofrequency counterparts.The required photon emissivity is lower.There is no significant change in the evolution of the volume-averaged neutral hydrogen fraction, x HI v , or the mean free path.Next, adding helium to our simulations and continuing to enforce calibration to the mean Ly α transmission by adjusting the ionizing volume emissivity somewhat reduces the gas temperature for a given source spectrum.At the same time the required ionizing volume emissivity increases to provide the necessary photons to ionize He I to He II .The mean free path also decreases by a small amount.For accurate quantitative inference on the temperature evolution of the IGM and how it relates to source SEDs especially if harder sources like QSOs are involved and/or if one is explicitly interested in the ionization state of helium, our modifications of ATON are obviously needed.If one is mainly interested in how hydrogen reionization proceeds, hydrogen-only, mono-frequency simulations require significantly fewer resources but as discussed have obvious shortcomings.
We also study the effect of the ionizing source model on reionization.In our fiducial model the ionizing photon emissivity Ṅ ion of a source is proportional to its total halo mass M. We consider a Democratic model in which Ṅ ion is independent of M, and an Oligarchic model in which Ṅ ion is also proportional to M but is non-zero only in haloes with the highest masses.These three models display interesting differences that can be interpreted as resulting from the correlation between the IGM density and halo locations.In order to reproduce the observed mean Ly α flux F , the Oligarchic model requires a somewhat smaller emissivity while the Democratic model requires a greater emissivity relative to the fiducial model.In the Oligarchic model, reionization starts slightly earlier than in the fiducial model and ends slightly later.In the Democratic model, the early stages of reionization are very similar to those in the fiducial model, but the end of reionization occurs also slightly later than in the fiducial model.All three models are broadly consistent with the distribution of the Ly α opacity and o v erall the differences in this regard are surprisingly small.
The Ly α opacity measurements from the Ly α forest data still leave substantial freedom for the evolution of the reionization in its early stages.We have considered here two model variations to investigate this.While our fiducial simulation has its midpoint of reionization at z = 6 .5, our 'Early' reionization model has its midpoint at z = 7 .5 and our 'Extremely Early' reionization model has its midpoint at z = 9 .5. All three of these models are calibrated to the observed mean Ly α transmission.They are also broadly in agreement with the measured Ly α opacity distributions.The 'Early' and 'Extremely Early' models thereby complete reionization slightly earlier than the fiducial model.Note, ho we ver, that these dif ferences are again small.The redshift at which x HI v = 0 . 1 per cent varies between z = 5 . 2 and 5.4.
We have quantified the size of ionized 'bubbles' in our simulations using an extension to the connected-CLA.As expected the fraction of the simulation volume occupied by bubbles of a minimum size tracks the progress of reionization.The spatial distribution of haloes with masses able to host bright galaxies at high redshift ( M > 10 10 M h −1 ) is highly biased.In all our models, the fraction of such haloes located in bubbles with sizes greater than 0.5 pMpc rapidly approaches unity in the early stages of reionization.The fraction of such haloes located in bubbles with sizes greater than 1 and 3 pMpc also rises significantly earlier than the v olume-a veraged neutral fraction.Ho we ver, only in our 'Extremely Early' model this happens already at z 10.In particular our fiducial model and perhaps also our 'Early' model might thus be difficult to reconcile with the recent JWST detections of abundant Ly α emission from galaxies up to z ∼ 10 and beyond.As sample sizes increase, the occurrence of Ly α emission at the highest redshifts should become an increasingly important constraint for inferring the early stages of reionization from JWST data and may already now be in (mild) tension with the lower end of values for the Thomson optical depth suggested by the Planck data.

AC K N OW L E D G E M E N T S
We thank Prakash Gaikwad, Vid Ir ši č, Margherita Molaro, Ewald Puchwein, and Sindhu Satya v olu for helpful discussions.We also thank the referee, Nick Y. Gnedin, for their very helpful comments.This w ork w as performed using resources provided by the

A P P E N D I X A : R A D I AT I V E T R A N S F E R A N D T H E R M O C H E M I S T RY I N AT O N -HE
This section co v ers the ke y features of the original code, as well as the modifications we implemented.We followed the description in Aubert & Teyssier ( 2008 ) and Rosdahl et al. ( 2013 ).

A1 Moments of the radiati v e transfer equation
The radiative transfer equation in comoving coordinates is given by where the second term describes the propagation of radiation while the 1 /a factor takes into account the cosmic expansion.The third term is the effect due to cosmological redshifting, and the last term describes the dilution of radiation (Gnedin & Ostriker 1997 ).The quantity I ν ( x , n , t) is specific intensity of the radiation, κ ν ( x , n , t) is the absorption coefficient, and η ν ( x , n , t) is the source function.
Note that a em is approximately 1 for the short photon crossing time and therefore we can write, Before moving on to the moment equations, we define the radiation energy density (Rosdahl & Teyssier 2015 ) as We define the radiation flux as ) and the radiation pressure as For the zeroth-order moment equation we integrate equation ( A2 ) o v er the solid angle.All angular integrations are over the whole sphere.This leads to Using the definition of the radiation energy density (equation A3 ), we get Substituting equation ( A4 ), ) and dividing by the photon energy we get,   The ionized hydrogen fraction decreases monotonically as we mo v e further away from the centre.Similar behaviour is observed for of doubly ionized helium, although the extent of ionization differs due to the slightly higher recombination rates of He III and the relati vely lo w flux at the rele v ant energies.For He II , we observe that the region with the highest value of the ionized fraction is not at the centre.This is because of the He III ionization front that starts to develop near the centre of the box.This behaviour of the ionization fractions is seen more clearly in Fig. B2 , where we plot the one-dimensional distribution of the ionized fractions from Fig. B1 , also at t = 100 Myr.Also shown in this figures are the results from Friedrich et al. ( 2012 ) for comparison.The results from the two codes compare well.The residual small differences can be attributed to the differences between the M1 closure technique of ATON-HE and the ray tracing used in C 2 -RAY , as detailed in Iliev et al. ( 2009 ), and the implementation of the on-the-spot approximation by Friedrich et al. ( 2012 ).
We did a series of convergence tests to examine the effects of the speed of light, number of bins, and spatial resolution, on the abo v e results.Fig. B3 shows the results of these tests.We see that our chosen values of these three parameters give converged results.

A P P E N D I X C : A C C U R A C Y R E Q U I R E M E N T S O N F L U X C A L I B R AT I O N
Our simulations are calibrated to mean Ly α transmission measurements.Ho we ver, e ven in the calibrated runs, the agreement between the simulated Ly α transmission with its observed value is not perfect.This is often because manually finding the required emissivity evolution is challenging.But inaccuracies can also arise because the simulation boxes are not al w ays available at the exact redshift values corresponding to the observations.One can therefore ask what the acceptable level of agreement should be between the simulated and observed values of the mean Ly α transmission.We investigate this in Fig. C1 .Panel (a) of these figures show the simulated mean Ly α flux from 10 runs.The orange curve is our chosen calibration, while the blue curves represent the range of the other calibrations.All simulations are consistent with the data given the observational uncertainties.But, as panel (b) shows, the mean flux in the runs are different up to about 10 per cent.Panels (c) and (d) show the CDFs of the ef fecti ve Ly α opacity for the dif ferent calibrations at the redshifts where the differences in mean flux are largest.The measured CDFs are also shown for context.We see that the 10 per cent differences in the mean flux do not magnify to large differences in the CDFs.  .Panel (a) illustrates how the mean flux varies for consistent calibrations when the total volume emissivity is adjusted.The orange line is for our fiducial calibration, while the blue curves show possible alternative calibrations.Adjustments to the emissivity were made to ensure that the simulation results agree closely with the observed mean flux, as indicated by the black markers, based on (Bosman et al. 2022 ).Panel (b) shows the differences in the mean flux δ F of the fiducial calibrated simulation (orange) and the rest of the simulations (the blue curves in Panel a).At the rele v ant redshifts, the largest differences between the different calibrations is around 10 per cent.Panels (c) and (d) show the distribution of CDFs of the effective optical depth at the two redshifts where the differences in the mean flux are largest.The orange curves show the median CDF for the fiducial calibration, while the blue curves show the range of median CDFs for the alternative calibrations.A variation of approximately 10 per cent in mean flux values corresponds to similar differences in the CDFs.

A P P E N D I X D : T H E R M A L E VO L U T I O N C O N V E R G E N C E
Fig. D1 illustrates the variation of temperature at mean density in an ATON-HE simulation with box size 20 cMpc /h and a 512 3 grid.The source spectrum is black-body with temperature 4 × 10 4 K. Different curves show the simulation result when all that is changed is the number of energy bins (See Section 2.2.2 for a detailed description for frequency binning).We observe that as the number of frequency bins increases, the temperature suddenly rises by approximately 20 per cent upon the addition of the fourth bin, after which it stabilizes around that value.This is because with just three bins, the part of the source spectrum with energies greater than the He II ionizing potential is not sampled well enough.With number of bins greater than four, ho we ver, we see a convergence of about 5 per cent in the temperature computation.This moti v ates the use of four photon energy bins used in this paper.This paper has been typeset from a T E X/L A T E X file prepared by the author.

Figure 1 .
Figure 1.The total halo mass function in our simulation.The dashed line shows the total halo mass cut-off at 10 9 M h −1 that we impose in our fiducial source model, as the simulation struggles to resolve haloes with mass below this value.

Figure 2 .
Figure2.The black-body spectrum for two values of temperature.These spectra are used as source SEDs in our simulations.Shaded regions indicate our chosen frequency bins, each represented by a distinct colour.The highest energy bin extends to infinity.Dotted curves within each shaded region represent the average energy value for the corresponding bin.We have also marked the ionization energies for H I , He I , and He II .

.
The spectra of sources in simulations were modelled to be black-body with temperatures of 4 .minimum mass of the haloes capable of producing ionizing photons is represented by M min .The mid-point of reionization is represented by z mid .The scaling of the ionizing photon emissivity Ṅ ion of the sources with halo mass M

Figure 4 .
Figure 4. Panels (a-k) show the CDFs of the effective Ly α opacity τ eff of the IGM computed over 50 cMpc /h sections of the Ly α forest in the ten simulations presented in this paper.Each of these ten panels shows two CDFs.The CDF with smaller average values of τ eff corresponds to 5 .36 < z < 5 .51, and the other CDF corresponds to 5 .67 < z < 5 .83.These shaded regions denote the 15.87-84.13percentiles of 10 000 realizations of the CDF, each created using the same number of LOS as in the observed sample, i.e. 65 and 48, respectiv ely, measured o v er 50 cMpc h −1 se gments of the Ly α forest.The black shaded regions in each of the panels show measurements by Bosman et al. ( 2022 ).Panel (l) compares measurements by Bosman et al. ( 2018 ), Eilers et al. ( 2018 ), and Bosman et al. ( 2022 ).

Figure 6 .
Figure 6.The distribution of neutral hydrogen fraction x HI , shown in a two-dimensional slice from the neutral hydrogen distribution in the simulation box, in our fiducial ('Helium-40') simulation (left panel) and the simulations with the 'Democratic' and 'Oligarchic' source models (central and right panels).

Figure 7 .
Figure 7. Lightcones of the neutral hydrogen fraction for our fiducial (panel a), the 'Early' (panel b), and 'Extremely Early' (panel c) reionization histories.

Figure 8 .
Figure 8. Evolution of the CDFs of the ef fecti ve Ly α opacity in the fiducial (blue), 'Early' reionization (red), and 'Extremely Early' reionization (yellow) models in narrow redshift bins from z ∼ 6 to 5. Also shown are the CDFs measured by Bosman et al. ( 2022 ).Each panel shows the number of lines of sight n LOS observed by Bosman et al. ( 2022 ) in that redshift bin.The simulated samples are chosen have the same size as the observed sample in the respective redshift bin.Shaded regions show the range between the 15.87 and 84.13 percentiles.

Figure 9 .
Figure 9.An illustration of how ionized hydrogen bubbles are identified in our simulations.The left panel shows a small, 10 Mpc h −1 wide two-dimensional slice from the neutral hydrogen distribution in the simulation box.This information is converted into a binary, black-and-white map in the central panel, with ionized regions ( x H I < 0 .5) shown in white and neutral regions ( x H I > 0 .5) shown in black.In the right panel, a friends-of-friends algorithm now identifies distinct bubbles.Each bubble is numbered.In this example, we see that there are seven bubbles, taking into account the periodic boundary conditions.

Figure 10 .
Figure10.Panel (a)  shows the fraction of volume in ionized bubbles ( x HI < 0 .5) with radius greater than 0.5 pMpc (dotted curves), 1 pMpc (solid curves), and 3 pMpc (dashed curves) in our fiducial, 'Early', and 'Extremely Early' reionization models.For these three bubble sizes, Panel (b) shows the fraction of haloes with mass greater than 10 10 M h −1 that are located in the ionized regions in these three models.Panel (c) focuses on our fiducial model, and a minimum bubble radius R min of 1 pMpc, for which this panel shows the number of haloes with mass greater than 10 9 , 10 10 , 10 11 , and 10 12 M h −1 that are located in the ionized regions.
Zheng & Miralda-Escud é 2017 and Weinberger et al. 2018 for a discussion of the combined effect of CGM and IGM on Ly α transmissivity and Tang et al. 2024 for MNRAS 533, 2843-2866 (2024)

Figure B3 .
Figure B3.Effect of reduced speed of light (left panel), number of frequency bins (middle panel), and spatial resolution (right on the ionized hydrogen fraction in the Str ömgren test.The box size is 38.4 kpc in all panels.In the left panel, we use a uniform 256 3 grid.In the central panel, the grid is 16 3 .The speed of light for the centre and right panels is 1 × 10 −2 c. Downloaded from https://academic.oup.com/mnras/article/533/3/2843/7733916 by University of California, Santa Barbara user on 10 September 2024 MNRAS 533, 2843-2866 (2024)

Figure C1
Figure C1.Panel (a)  illustrates how the mean flux varies for consistent calibrations when the total volume emissivity is adjusted.The orange line is for our fiducial calibration, while the blue curves show possible alternative calibrations.Adjustments to the emissivity were made to ensure that the simulation results agree closely with the observed mean flux, as indicated by the black markers, based on(Bosman et al. 2022 ).Panel (b) shows the differences in the mean flux δ F of the fiducial calibrated simulation (orange) and the rest of the simulations (the blue curves in Panel a).At the rele v ant redshifts, the largest differences between the different calibrations is around 10 per cent.Panels (c) and (d) show the distribution of CDFs of the effective optical depth at the two redshifts where the differences in the mean flux are largest.The orange curves show the median CDF for the fiducial calibration, while the blue curves show the range of median CDFs for the alternative calibrations.A variation of approximately 10 per cent in mean flux values corresponds to similar differences in the CDFs.

Figure D1 .
Figure D1.Top panel shows the evolution of the gas temperature at mean density for test simulations with a box size of 20 cMpc h −1 and a 512 3 uniform grid.The number of frequency bins range from one to ten.The bottom panel displays the ratio with the monofrequency result.

Table 1 .
Summary of simulations considered in this paper.The simulation box size is 160 cMpc h −1 Cambridge Service for Data Driven Discovery (CSD3) operated by the University of Cambridge Research Computing Service ( www.csd3.cam.ac.uk), provided by Dell EMC and Intel using Tier-2 funding from the Engineering and Physical Sciences Research Council (capital grant EP/T022159/1), and DiRAC funding from the Science and Technology Facilities Council ( www.dirac.ac.uk).The project was also supported by a grant from the Swiss National Supercomputing Centre (CSCS) under project ID s1114.Support by ERC Advanced Grant 320596 'The Emergence of Structure During the Epoch of Reionization' is gratefully acknowledged.MGH has been supported by STFC consolidated grants ST/N000927/1 and ST/S000623/1.GK gratefully acknowledges support by the Max Downloaded from https://academic.oup.com/mnras/article/533/3/2843/7733916 by University of California, Santa Barbara user on 10 September 2024 MNRAS 533, 2843-2866 (2024) Planck Society via a partner group grant.GK is also partly supported by the Department of Atomic Energy (Go v ernment of India) research project with Project Identification Number RTI 4002.Part of the work has been performed as part of the DAE-STFC collaboration Building Indo-UK collaborations towards the Square Kilometre Array' (STFC grant reference ST/Y004191/1).SA also thanks the Science and Technology Facilities Council for a PhD studentship (STFC grant reference ST/W507362/1), and the University of Cambridge for providing a UKRI International Fees Bursary.
(Hui & Gnedin 1997 )s://academic.oup.com/mnras/article/533/3/2843/7733916 by University of California, Santa Barbara user on 10 September 2024 = n H II + n He II + 2 n He III and α A , α B are the case A and case B recombination coefficients.Discretization of the number density, and solving it implicitly leads to The value of the He II is x He II = numerator / denominator .Solving the evolution of He III implicitly we get, He is the number density of helium, and n e is the number density of electrons.The 'Cool' is the total cooling rate defined as Cool = ζ H I n e n H I + ζ He I n e n He I + ζ He II n e n He II + η A H II n e n H II + η A He II n e n He II + η A He III n e n He III + ψ H I n e n H I + ψ He II n e n He II + θn e ( n H II + n He II + 4 n He III ) + ω He II n e n He II + n e .(A40)Thetemperaturecoefficientsaredefined in Section A11 .The denom-We use the following values of rate coefficients for case A and case B recombination(Hui & Gnedin 1997 ):α A H II = 1 .269×10 −13 cm 3 s −1 + ( λ H I / 0 .522) 0 47 1 . 92 (A42) α A He II = 3 × 10 −14 cm 3 s −1 λ 0 . 654 I , (A43)α A He III = 2 .538× 10 −13 cm 3 s −1 He II / 0 .522)0 .47 ] 1 . 923 A44)α B H II = 2 .753 × 10 −14 cm 3 s −1 ij ) , (A39) where n tot = n H + n He + n e , n H is the number density of hydrogen, n I [1 + ( λ H I / 2 .74) 0 . 407] 2 . 242, (A45)