New insight on the nature of cosmic reionizers from the CEERS survey

The Epoch of Reionization (EoR) began when galaxies grew in abundance and luminosity, so their escaping Lyman continuum (LyC) radiation started ionizing the surrounding neutral intergalactic medium (IGM). Despite signiﬁcant recent progress, the nature and role of cosmic reionizers are still unclear: in order to deﬁne them, it would be necessary to directly measure their LyC escape fraction ( f esc ). However, this is impossible during the EoR due to the opacity of the IGM. Consequently, many e ﬀ orts at low and intermediate redshift have been made to determine measurable indirect indicators in high-redshift galaxies so that their f esc can be predicted. This work presents the analysis of the indirect indicators of 62 spectroscopically conﬁrmed star-forming galaxies at 6 ≤ z ≤ 9 from the Cosmic Evolution Early Release Science (CEERS) survey, combined with 12 sources with public data from other JWST-ERS campaigns. From the NIRCam and NIRSpec observations, we measured their physical and spectroscopic properties. We discovered that on average 6 < z < 9 star-forming galaxies are compact in the rest-frame UV ( r e ∼ 0 . 4kpc), are blue sources (UV-β slope ∼− 2 . 17), and have a predicted f esc of about 0.13. A comparison of our results to models and predictions as well as an estimation of the ionizing budget suggests that low-mass galaxies with UV magnitudes fainter than M 1500 = − 18 that we currently do not characterize with JWST observations probably played a key role in the process of reionization.


Introduction
The Epoch of Reionization (EoR) is a period in the history of the Universe, occurring roughly during its first billion years, when the hydrogen in the intergalactic medium (IGM) transitioned from a nearly completely neutral to a nearly completely ionized state.This transition was driven by the Lyman continuum (LyC; λ < 912 Å) radiation emitted by the first luminous sources that formed in the early Universe.However these sources, i.e. the socalled cosmic reionizers, remain elusive: star-forming galaxies can only account for the photon budget to complete reionization if a substantial fraction of the Ultra-Violet (UV) photons produced by their stellar populations escape from the galaxies' interstellar medium (ISM) and circumgalactic medium (CGM).As a result of the density of star-forming galaxies in the EoR, an average LyC escape fraction ( f esc ) of 10% across all galaxies is needed (e.g., Yung et al. 2020a,b;Finkelstein et al. 2019;Robertson et al. 2015) to reionize the Universe by z = 6, and match the Thomson optical depth of electron scattering in the Cosmic Microwave Background (CMB) (Planck Collaboration et al. 2020).At z ≥ 4.5, however, it is impossible to detect the LyC photons escaping from galaxies, since they are absorbed and scattered by the IGM along the line of sight (Inoue et al. 2014), and the LyC can only be detected at low and intermediate redshift (e.g., Flury et al. 2022a;Izotov et al. 2016aIzotov et al. ,b, 2018a,b;,b;Wang et al. 2019;Steidel et al. 2018;Fletcher et al. 2019;Vanzella et al. ⋆ E-mail: sara.mascia@inaf.it2018, 2020;Marques-Chaves et al. 2021, 2022).To overcome this problem, key properties of the ISM and conditions that facilitate LyC photons escape (the so-called indirect indicators) at lower redshifts have been identified (see Flury et al. 2022a, for a review) and used to infer the average f esc of the cosmic reionizers (e.g., Jung et al. 2023;Mascia et al. 2023;Roy et al. 2023;Saxena et al. 2023).
The relative importance of massive and low-mass galaxies in driving reionization is still a matter of great debate as it is intrinsically related to the timeline and topology of reionization.It is expected that reionization starts earlier, and perhaps proceeds in a spatially more homogeneous manner, when faint and low-mass galaxies with a higher f esc dominate ionizing photon budgets over bright galaxies (e.g., Ferrara & Loeb 2013;Finkelstein et al. 2019;Dayal et al. 2020).Conversely, a relatively delayed reionization process is predicted when the contributions from faint galaxies (M 1500 ≥ −18) are subdominant to that from brighter systems (Robertson et al. 2015;Naidu et al. 2020).While both types of galaxies are likely to contribute to the ionizing budget, the balance and interplay between them remain uncertain.To understand the role of faint and bright sources, we need to determine what is their relative contribution to the total ionizing emissivity (ṅ ion ), i.e., the number of ionizing photons emitted per unit time and comoving volume (see Robertson 2022, for a detailed review) which is commonly expressed as: in which ξ ion is the ionizing photon production efficiency, i.e. the number of produced ionizing photons per UV luminosity density, the ρ UV is the integral of the UV luminosity function (the number of galaxies per UV luminosity per comoving volume), and f esc is the fraction of ionizing photons that reaches the IGM.
In the above equation, the ρ UV of galaxies is relatively wellconstrained up to the very high-redshift Universe (e.g.Bouwens et al. 2015Bouwens et al. , 2021;;Donnan et al. 2023).We know that many factors influence the photon production efficiency, including the initial mass function, the stellar metallicity, the evolution of individual stars, and possible stellar binary interactions (e.g., Zackrisson et al. 2011Zackrisson et al. , 2013Zackrisson et al. , 2017;;Eldridge et al. 2017;Stanway & Eldridge 2018, 2019).A commonly accepted value is log ξ ion = 25.3 but many recent observations at intermediate and high redshifts (e.g., Matthee et al. 2017;Izotov et al. 2017;Nakajima et al. 2018;Shivaei et al. 2018;Lam et al. 2019;Bouwens et al. 2016;Stark et al. 2015Stark et al. , 2017;;Atek et al. 2022;Castellano et al. 2022Castellano et al. , 2023)).Yung et al. (2020b) demonstrated that ξ ion can vary quite widely as a function of galaxy properties, and a fixed value is just not sufficient to properly capture the scatter in a large population of galaxies.With the James Webb Space Telescope (JWST), we are now able to measure ξ ion from the rest-frame optical lines (e.g., Schaerer et al. 2016;Shivaei et al. 2018;Chevallard et al. 2013;Tang et al. 2019a), instead of adapting the same value for the entire galaxy population.The only remaining big uncertainty in the emissivity equation is thus the escape fraction and how it varies with M 1500 (or stellar mass), which is the subject of this work.
In Mascia et al. (2023) (M23 hereafter), we have shown that at the end of reionization (4.5 ≤ z ≤ 8), star-forming galaxies are often compact (r e ≃ 0.2−0.5 kpc), and with blue UV slopes (median β = −2.08).Moreover, the analyzed sources present properties (in terms of the [O iii]λλ4959, 5007/[Oii]λ3727 line ratios, O32 hereafter, Hβ rest-frame equivalent widths, EW 0 (Hβ), UVβ slopes, r e , and Σ S FR ) consistent with those of low-z galaxies with measured f esc larger than 0.05.These results suggested that the average low mass galaxies around the EoR have physical and spectroscopic properties consistent with moderate escape of ionizing photons ( f esc = 0.1 − 0.2), resulting in a dominance of low-mass, faint galaxies during cosmic reionization.The results of M23 may clarify the role of faint galaxies during reionization, but were based on a very limited sample of sources.In this work we use the JWST/Near InfraRed Spectrograph (NIR-Spec) and Near InfraRed Camera (NIRCam) observations from the Cosmic Evolution Early Release Science (CEERS) survey of a much larger sample of high redshift galaxies to probe their role as cosmic reionizers during the EoR and put the conclusions of M23 on firmer grounds.This paper is organized as follows: we present the data set in Sec. 2. We characterize the selected sample in Sec. 3, and compare the physical and spectroscopic properties with models in Sec. 4. In Sec. 5 we estimate the total ionizing budget from our sample and discuss our results, while in Sec.6 we summarize our key conclusions.Throughout this work, we assume a flat ΛCDM cosmology with H 0 = 67.7 km s −1 Mpc −1 and Ω m = 0.307 (Planck Collaboration et al. 2020) and the Chabrier (2003) initial mass function.All magnitudes are expressed in the AB system (Oke & Gunn 1983).

CEERS-JWST data
We used JWST/NIRSpec observations from the Cosmic Evolution Early Release Science survey (CEERS; ERS 1345, PI: S. Finkelstein) in the CANDELS Extended Groth Strip (EGS) field (Grogin et al. 2011;Koekemoer et al. 2011).The final list of targets selected for spectroscopic observations during the CEERS program and the way in which targets have been prioritized will be presented in Finkelstein et al. (in prep, see also Finkelstein et al. 2022a,b), while the NIRSpec data will be described in Arrabal Haro et al. (in prep.), see also Arrabal Haro et al. (2023).We also use the CEERS NIRCam imaging in six broadband filters (F115W, F150W, F200W, F277W, F356W and F444W) and one medium-band filter (F410M) over 10 pointings.Details on imaging data reduction and analysis are presented in Bagley et al. (2023) (see also Finkelstein et al. 2022a,b).
In this section, we provide a brief summary, highlighting the most relevant points and explaining the methods we used to study the properties of the galaxies of our sample.
The focus of this study is on all sources at 6 ≤ z ≤ 9. We selected all the sources with a photometric redshift higher than 5 that have a NIRSpec spectrum obtained either with the three medium-resolution (R ≈ 1000) grating spectral configurations (G140M/F100LP, G235M/F170LP and G395M/F290LP), which, together, cover wavelengths between 0.7-5.1 µm, or with the PRISM/CLEAR configuration, which provides continuous wavelength coverage of 0.6-5.3µm with spectral resolution ranging from R ∼ 30 to 300.We visually examined all these spectra for detectable optical lines and measured the systemic redshifts of 70 sources in the chosen range, using the Hβ, [O iii]λλ4959, 5007, and (when present) Hα lines.The best redshift solution was determined by fitting single Gaussian functions to the strongest emission lines and combining the centroids of the fits.In 66 cases, the [Oii]λλ3727, 3729, [Oiii] and/or Hβ were detected and their line fluxes were measured.For the remaining 4 cases, the redshifts were obtained by fitting the Hα line alone, so they are formally included in our sample but they can not be used for further analysis since this is the only line present in the spectra.For this part of our analysis, we use Mpfit 1 (Markwardt 2009).Note that with the PRISM's resolution of R > 140 at λ > 3.4 µm, we are able to discern Hβ from [Oiii], and resolve the [Oiii] doublet but we do not resolve the Hα + [Nii] doublet.

Data from other programs
Several additional public sources are used to expand our EoR sample.In M23, we examined a sample of sources observed from the GLASS-ERS program (PID 1324, PI: T. Treu) using three high-resolution (R ∼ 2000-3000) spectral configurations  (G140H/F100LP, G235H/F170LP, and G395H/F290LP).For the purpose of this work we specifically selected the 7 sources at z spec > 6 (GLASS-JWST IDs: 10000, 10021, 100001, 100003, 100005, 150008, 400009), along with 2 additional sources at z spec > 6 from a DDT program (PID 2756, PI: W. Chen), which were obtained using the PRISM/CLEAR configuration (DDT IDs: 10025, 100004).All these sources are located in the Abell 2744 cluster field.
From the spectroscopic redshift catalogue by Noirot et al. (2023), we selected 4 more sources from the Early Release Observations (ERO) program on the galaxy cluster SMACS J0723.3-7327 at z spec > 6 (ERO IDs: 4590, 5144, 6355, 10612).These spectra were acquired with medium resolution spectral configurations (G235M and G395M).The properties we use in this work were derived from Trussler et al. (2022) and Schaerer et al. (2022).For all the above sources, IDs, coordinates, spectroscopic redshifts, spectroscopic, and physical properties are reported in Appendix 1, Table 2.

Measurements of physical parameters
We measured the physical parameters of the CEERS sample as described in Santini et al. (2023), by fitting synthetic stellar templates with zphot (Fontana et al. 2000) to the seven-band NIR-Cam photometry (Finkelstein et al. 2023, for the sources marked with * in Table 1) and the released HST photometry (Stefanon et al. 2017).Specifically we measured the stellar masses M ⋆,obs , the observed absolute UV magnitudes at 1500Å (M 1500,obs ), the dust reddening E(B − V) and the ages.We adopted Bruzual & Charlot (2003) models and assumed delayed exponentially declining star formation histories -SFH(t)∝ (t 2 /τ) • exp(−t/τ)with τ ranging from 0.1 to 7 Gyr.The age ranges from 10 Myr to the age of the Universe at each galaxy redshift, while metallicity can assume values of 0.02, 0.2 or 1 times Solar metallicity.For the dust extinction, we used the Calzetti et al. (2000) law with E(B − V) which can assume values ranging from 0, 0.03, 0.06, 0.1, 0.15, and from 0.2 to 1.1 in step of 0.1.We computed 1σ uncertainties on the physical parameters by retaining, for each object, the minimum and maximum fitted masses among all the solutions with a probability P(χ 2 ) > 32% of being correct, fixing the redshift to the best-fit value.In Fig. 1 we present the M 1500,obs distribution of the CEERS sources in our sample, which ranges from −22 to −18 AB mag.For reference, we also show the distribution of the M 1500 for the GLASS and ERO sources we are considering in this work.

Dust correction and emission line flux measurements
We measured the total flux of each detected line (Balmer lines, [Oii], and [Oiii]) with a single Gaussian fit.From the flux measurement we subtracted a constant continuum emission, which is estimated from a wavelength region adjacent (±160Å) to the emission line.When the continuum was not well constrained (signal-to-noise ratio S/N < 2) from the fit, we estimated it subtracting the line contribution to the F444W photometry, following Fujimoto et al. (2023).When the S/N of [Oii], [Oiii], or Hβ was less than 2, we set 2σ as an upper limit.
Prior to carrying out a quantitative analysis, it is necessary to consider corrections for dust reddening.For 28 galaxies, Hα and Hβ are both available and we calculated the correction for dust extinction on the basis of the Balmer decrement, assuming a Calzetti et al. (2000) extinction law and an intrinsic ratio Hα/Hβ = 2.86 (see e.g., Domínguez et al. 2013;Kashino et al. 2013;Price et al. 2014), which is valid for an electron temperature of 10000 K.The nebular E(B − V) determined from the Balmer decrement are in agreement with the stellar reddening determined from the SED fitting.Therefore for the 38 sources in the sample without Hα, we converted their stellar E(B − V) S ED to nebular E(B − V) following Calzetti et al. (2000) and applied the nebular corrections derived from these values.
With the dust corrected spectra, we calculated the O32 line ratios and the [Oiii] and/or Hβ rest-frame EWs.We list all these values in Table 1.Within the errors, our measurements are consistent with those from previous works for sources in common (Jung et al. 2023;Fujimoto et al. 2023;Arrabal Haro et al. 2023;Tang et al. 2023).

UV-β slopes
We measured the UV-β slope of our galaxies from the NIR-Cam photometry and/or the previously available HST photometry (Stefanon et al. 2017), with the approach detailed in Calabrò et al. (2021).We considered all the photometric bands whose entire bandwidths are between 1216 and 3000 Å rest frame.The former limit is set to exclude the Lyα line and Ly-break, while the latter limit is slightly larger than that adopted in Calabrò et al. (2021) to ensure that we can use more bands.
We then fitted the selected photometry with a single powerlaw of the form f(λ) ∝ λ β (Calzetti et al. 1994;Meurer et al. 1999).In practice, we fitted the available photometric bands amongst HST F125W, F140W, F160W and JWST-NIRCam F115W, F150W or F200W depending on the exact redshift of the sources.This choice allows us to uniformly probe the spectral range between 1500 and 3000 Å for most of the galaxies.We measured the β and associated uncertainty for each source using a bootstrap method: by using n = 1000 Monte Carlo simulations, the fluxes in each band were resampled according to their error.The results provided a slope distribution from which we calculated the mean and standard deviation of β for each galaxy.Two of the sources in our sample did not have the necessary data, so we were able to estimate the β slopes only for 64 galaxies.The results on β with associated errors are reported in Table 1.We note that for 5 sources different β slopes are published in literature (Jung et al. 2023;Arrabal Haro et al. 2023), but they were estimated from SED fitting or from the spectra.For 4 out of 5, our values are consistent with the published ones within the uncertainties.
In Fig. 2 we show the relation between our measured β slope values and M 1500 and the observed trend at z = 7 from Bouwens et al. (2014).We also plot the β values as function of M 1500 for the GLASS and ERO sample.Our results are consistent with the best fit relation from Bouwens et al. (2014) although with a large scatter.We must notice that the galaxies with the bluer slopes (with values around -3) that most deviate from the relation also have the largest uncertainties.Overall we confirm the existence of a broad correlation between β and UV magnitude at z ∼ 7 (e.g., Wilkins et al. 2011;Finkelstein et al. 2012;Bouwens et al. 2014;Nanayakkara et al. 2023).Our average β value at z ∼ 7, ⟨β⟩ = −2.17± 0.47, is in good agreement with Dunlop et al. (2013) (⟨β⟩ = −2.1 ± 0.2 at z ∼ 7).

UV half-light radii
We measure the half-light radius r e of each galaxy in the restframe UV using the python software Galight2 (Ding et al. 2020), which adopts a forward-modeling technique to fit a model to the observed luminosity profile of a source.We assume that the galaxies are well represented by a Sersic profile (Sersic 1968).In the fitting process, we constrain the axial ratio q to the range 0.1-1, and we fix the Sersic index n to 1, which is suitable for star-forming galaxies and also adopted by Yang et al. (2022b) and Morishita et al. (2018).This latter choice is consistent with the median value that we find for a subset of sources with higher S/N for which the fit converges to a finite n and r e when leaving all the parameters free (see also Mc Grath et al. in prep.).The uncertainties on the sizes were estimated following Yang et al. (2022b) and re-scaled to the S/N from the photometry.The results obtained with Galight are robust, as shown by previous works (e.g., Kawinwanichakij et al. 2021), and in agreement with those estimated using traditional softwares such as Galfit (Peng et al. 2002).
For 50 galaxies, we used the NIRCam photometry to measure r e in the F150W band (except for ID:542, for which the size is measured in F200W to improve the fit precision), corresponding to the UV rest-frame of the galaxies.For 19 galaxies where only the HST photometry was available, we measured r e using the F160W filter, which has the highest S/N.14 sources have profile resolutions that are likely unresolved, so we place an upper limit (see Calabrò et al. in prep).In cases where additional sources are present in the same cutout of a galaxy, we masked them or fitted them with additional Sersic profiles.We list all these measurements in in Table 1.To determine the minimum size measurable in the F150W band, we followed a similar approach to that recently adopted by Akins et al. (2023) in the F444W band.In brief, we performed a set of simulations by creating mock F150W images of galaxies (as observed by CEERS) with a Sersic profile, different magnitudes (from 25 to 28), and different intrinsic sizes from 0.005 ′′ to 0.1 ′′ , in steps of 0.005.We then applied PSF fitting with Galfit, considering unresolved a source if it is undetected (S/N < 2) in the residual image.This procedure yields a minimum measurable size of 0.025 ′′ (i.e., ∼ 123 pc at redshift 8), which we adopt in this work as a lower limit.We will describe these simulations in more detail in Calabro et al. 2023 (in prep.).As for the galaxies taken from previous works, for the M23 sample r e was measured in the F115W band; for the ERO sample, F200W was considered for the sources at z > 7, and F150W for the galaxy ID:5144 at z = 6.381 (Trussler et al. 2022).Typical sizes of our galaxies range from 0.1 to 2 kpc and are consistent with rest-frame UV r e measured during reionization by recent works (Morishita et al. 2023;Yang et al. 2022b;Shibuya et al. 2015).In Fig. 3 we show the relation between our measured r e and M 1500 .Apart for a few outliers, we recover the well known magnitude-size relation: although with a large scatter, our results are consistent with the relation found at z ∼ 7 derived from HST data by Shibuya et al. (2015), and the relation at z ∼ 6 − 7 from Yang et al. (2022a) based on photometrically selected galaxies lensed by six foreground Hubble Frontier Fields (HFF) clusters.We note that most potential cosmic reionizers should have very small UV rest-frame dimensions (≤ 0.4 kpc), indicating highly concentrated star formation as for example found by Flury et al. (2022b) and in a few intermediate redshift leakers such as Ion1 (Ji et al. 2020).

AGN contamination
While we recognize that AGN may also play some role in reionization, e.g., Madau & Haardt (2015); Smith et al. (2018Smith et al. ( , 2020)), a concern with our current dataset is that any AGN identified here may constitute too small a sample, and might be too heterogeneous to properly evaluate their role in reionization.Therefore we exclude them in the current work, in order for us to provide the most robust measurements of the contribution of galaxies (non-AGN) to reionization, while the role of AGN is deferred to future studies with more suitable samples.We first visually Article number, page 5 of 14 examined all spectra to see if there were any broad lines in them.Then, we employed the optical rest-frame spectroscopic diagnostics to distinguish between star-forming galaxies and AGNs.Most of our sources from the CEERS program have redshifts higher than 6.7 (7.07), so their Hα and [Nii] emission lines cannot be identified due to the long-wavelength limit of NIRSpec G395M at z ≥ 6.7, and z ≥ 7.07 for the PRISM.In any case, at lower redshift Hα + [Nii] cannot be resolved with the PRISM.For this reason, we employed the mass-excitation (MEx) diagram (Juneau et al. 2011(Juneau et al. , 2014) ) with the division line identified by Coil et al. (2015) for z = 2.3 galaxies and AGN from the MOSDEF survey, as already done in M23.According to the visual inspection and the position of our sources in the MEx diagram, we conclude that our sample contains one AGN (ID: 1019) and 69 star-forming galaxies.The AGN at z = 8.679 was already identified and discussed by Larson et al. (2023).

Evaluating f esc
Assuming that the mechanisms that drive the escape of LyC photons are the same at all redshifts and depend only on the physical properties of the sources, several authors have recently attempted to derive empirical relations between f esc values and other observable and/or physical properties that can be measured also at high redshift.In particular, Lin et al. (2023) have applied the relation with the β slope derived by Chisholm et al. (2022), while Saxena et al. (2023) applied the relation predicted by Choustikov et al. (2023), which relies on the β slope, the E(B − V), the Hβ line luminosity, the M 1500 , the R23, and the O32.In M23 we presented our own empirical relation calibrated on the Flury et al. (2022a) low-redshift Lyman Continuum survey (LzLCS) sample, between f esc and β slope, r e , and logO32 (Eq. 1 in M23).Due to the fact that O32 and EW 0 (Hβ) exhibit a very tight correlation (Spearman correlation between them > 0.9), in M23 we used only one of the two values.However, since in some cases Hβ is measurable while O32 is not, here we also present an alternative relation using r e , β, and the EW 0 (Hβ).This relation can be used when it is not possible to derive O32 due to a lack of one of the two lines.This new relation has the following form: where the values between the parentheses are in the 95th percentile distribution.In Appendix 2 we present an analysis of the residuals between the measured f esc values for the LzLCS sample and those predicted using both relations.Using either Eq. 1 in M23 or Eq. ( 2), we predicted the f esc value for the CEERS 65 star-forming galaxies, in addition to the GLASS+DDT and ERO sources for which we have the β slopes.As already mentioned, the UV half-light radius of 1 source from the CEERS sample could not be determined due to the inability to achieve a good fit of the profile.Moreover, in 2 cases, the β slope could not be measured.Since these quantities appear in both of the proposed equations, we were unable to estimate f esc for 3 sources of the CEERS sample.For the remaining sources, we used the M23 equation in 49 cases in which O32 is measured accurately or it is a limit but Hβ is not evaluated, and the Eq. ( 2) for the other 13 cases.In total, we predict an f esc value for 74 sources from the three samples.Given the uncertainty both on the coefficients of the relations and on the quantities on which f esc depends, we estimate the f esc errors using a bootstrap method.We use n = 1000 Monte Carlo simulations varying both the coefficients and the individual properties within their uncertainty.The results provide an f esc distribution from which we determine the mean f esc and the standard deviation for each galaxy, which is taken as the uncertainty.In Fig. 4 we show two examples of the probability distribution function (PDF) of the f esc values resulting from the above Monte Carlo runs, for a galaxy with modest inferred mean f esc (0.05) and one with a high inferred mean f esc (0.24).In Table 1 we report the mean f esc and the standard deviation for the CEERS galaxies, in Appendix 1, Table 2 we report the same values for the GLASS and ERO sources.
In Fig. 5 we present the distribution of the inferred mean f esc values.Most of our galaxies have modest inferred f esc , of the order of 0.10 or below.The average f esc for our sample (with the standard error of the mean) is 0.13 ± 0.02.This value is affected by the high f esc (0.3 − 0.5) inferred for a handful of sources.The median in this case is a more representative value and it is equal to 0.08 ± 0.02.To evaluate the impact of using the mean f esc for each galaxy instead of the full PDF (which is not gaussian but more lognormal), we produced the same distribution shown in Fig. 5, this time stacking the individual PDF of all galaxies.The resulting distribution essentially unchanged: computing the mean and median values they are respectively 0.11 and 0.08, confirming that our results are robust.

f esc dependencies
In Fig. 6, left panel, we plot the predicted f esc values versus the stellar mass M * .We show average binned values (using a running average) with the shaded area indicating the 1σ uncertainty.We find that low-mass galaxies tend to have slightly higher escape fractions, although the relation is rather scattered.For comparison we also plot the prediction by Rosdahl et al. (2022) based on SPHINX cosmological radiation-hydrodynamical simulations of reionization.Their simulated values of f esc are generally lower than our predictions, well below 0.1 during most of the EoR, although they also find the same dependence on total stellar mass.Since most simulations predict f esc versus halo mass M h relations, we converted our stellar masses M * into halo masses M h following the relation as a function of redshift derived by Behroozi et al. (2019).We plot the M h versus the predicted f esc values in Fig. 6, right panel.We compare our results to the prediction by Rosdahl et al. (2022) (see above) and to those obtained by Ocvirk et al. (2021) and Lewis et al. (2023) using RAMSES-CUDATON radiation-hydrodynamical simulations.These simulations aim at reproducing the observed Lyα opacity distribution.Their predicted f esc are ∼ 1 for very low-mass galaxies and drop at M h ≥ ×10 9.5 M ⊙ .We plot both the fiducial and "permissive" model of Ocvirk et al. (2021), where this second one allows a more permissive recipe for SF also above the temperature of T * = 2 × 10 4 K.In Lewis et al. (2023) the fiducial model of Ocvirk et al. (2021) is extended through the inclusion of a physical model for dust production, coupled to the radiative transfer module.Finally we plot the predictions by Bremer & Dayal (2023) that are based on DELPHI simulations at z = 5 and z = 10.In this work, reionization starts at z ∼ 16, is complete at z = 5.67 and it is dominated by faint, low-mass galaxies with M h ≤ 10 7.8 M ⊙ at z ∼ 15 that show f esc up to 0.7.Most of the above models predict a very rapid increase of f esc with decreasing halo mass, below M h ≃ 10 10 M ⊙ , a range which we barely sample with our observations, and a very low almost null f esc for the more massive halos, at odds with our inferences.In the range of halo mass observed, simulations are more than 1 σ away from our inferred f esc .The strong discrepancy between the f esc values we derive from NIRSpec data and the model predictions could be due to a number of aspects: -It may be that simulations do not adequately capture the bursty nature of star formation.It has been shown that Supernova (SN) feedback plays a critical role in creating regions with higher transparency for LyC escape.As a consequence in the models there is a positive correlation between f esc and the SFR measured over the last 10 million years (Rosdahl et al. 2022).This suggests that bursty star formation contributes to higher f esc values.However accurately quantifying the burstiness of star formation observationally, and comparing it to a simulation's burstiness is a difficult task.For instance, it has been suggested that Hα / FUV fluxes for could help quantifying SFR burstiness observationally (Sparre et al. 2017), but this requires a fairly sophisticated post-treatment of simulations, and is very sensitive to the details of star formation, feedback (SN and radiative) and ISM modelling.Other probes of burstiness have been and will be proposed (Sun et al. 2023), and may offer avenues of progress on this topic.-Another potential reason for the large discrepancy could be the description of the thermodynamical state of the shockheated multi-phase CGM (van de Voort et al. 2015).For ex-  2021), the violet one is the prediction from Lewis et al. (2023), the line ones are the predictions from Bremer & Dayal (2023).
ample, in a case in which a clumpy CGM is composed of hot, highly ionized gas surrounding cold dense clumps, if the cold phase is sufficiently dense and the hot phase has high pressure, the clumps may have a small cross-section: with a small total covering fraction, a high f esc value could be observed.Insufficient spatial resolution in this case would imply artificially larger clumps, leading to a higher covering fraction and reducing the f esc .The complexity of this behavior is being explored in simulations (Gronke & Oh 2020).-Finally we should keep in mind that the relations that we have used to infer the f esc for galaxies in the EoR have been derived and tested using the LzLCs sample that is located at low redshift (z = 0.2 − 0.4).Therefore its applicability to the CEERS sample (6 ≥ z ≥ 9) is not straightforward.The large discrepancy with simulations might be due to an overestimate of the f esc in the EoR.
The permissive model of Ocvirk et al. (2021) is the only one that has average f esc high enough to be comparable to our values.This can be attributed to the permissive run's unique characteristic of permitting star formation in cells with temperatures potentially exceeding 2 × 10 4 K.These higher temperatures inherently lead to greater ionization and increased transparency compared to the fiducial run and hence to larger values of f esc .Interestingly, this model is not the one favored by Ocvirk et al. (2021) as it leads to an overionization of the Lyman-α forest characterized by unrealistically low Lyman-α IGM opacities.
In Fig. 7 we plot our predicted average f esc values versus the UV magnitude M 1500 .We note that Eq. ( 2) and the M23 relation have been derived on the LzLCS which only contains galaxies brighter than M 1500 ≃ −18.5.Therefore for our few faintest objects using the above equations might be an incorrect extrapolation.Our average f esc is almost constant within the observed magnitude range, altough we point out that we might start to be biased at the faintest luminosities (especially for objects with faint emission lines and hence small f esc ) due to the spectroscopic flux limit of the CEERS survey.In the same Figure we also show the predicted f esc vs M 1500 relationship by Lin et al. (2023), who analyzed 3 galaxies at z ≥ 8 behind the cluster RX J2129.4+0009.They developed an empirical model based on the LzLCS program, which first defines for a given galaxy a probability of being a LyC-leaker based on M 1500 , O32 and β and then infers the f esc values from the β slope following Chisholm et al. (2022).They predict for bright galaxies a very flat relation, similar to ours but with f esc values that are about a factor of 2 larger.In addition, they predict that f esc should slowly decrease for galaxies fainter than M 1500 = −19.Essentially this is due to the fact that the probability of faint galaxies of being LCE becomes lower.However they extrapolate this result from the LzLCS, which as already mentioned earlier, contains no sources below M 1500 = −18.5.As a final comparison, we plot the results by Matthee et al. (2022) who produced a semiempirical model based on constraints on the escape fractions of bright LAEs at z ∼ 2. These authors find that f esc peaks between −19 ≤ M 1500 ≤ −20 and then decreases very rapidly at fainter magnitudes (the so-called reionization by the oligarchs).At magnitudes brighter than M 1500 = 19 our average results are consistent with theirs, within the uncertainties, but we do not observe the strong decrease at fainter magnitudes.

Redshift evolution
In Fig. 8 we plot our predicted f esc versus the redshift.We also plot the sources from M23 at redshift lower than 6 which were derived with the same method.The average f esc in the three redshift bins, 5 ≤ z < 6, 6 ≤ z < 7 and 7 ≤ z ≤ 9, are respectively equal to 0.11, 0.12 and 0.14.We therefore observe a slight increase of the average f esc with redshift, although statistically not significant.A similar trend would be observed using the median values.We also show the predicted f esc as function of redshift from Rosdahl et al. (2022), derived from Figure 6 (pred.)running average Lin+23 Matthee+22 z = 6.9 Fig. 7: Predicted f esc vs M 1500 .Symbols are the same as in Fig. 2. The green line shows the running average for our sample, the pink one refers to Matthee et al. (2022) and the yellow one to Lin et al. (2023).
crease of the f esc with redshift which is very similar to what we observe.
In the same plot we also show the sample of Lyα emitters at z = 6 − 8 from the JWST Advanced Deep Extragalactic Survey (JADES) presented in Saxena et al. (2023), which span the same M 1500 range as our sources.They predict the f esc using an equation proposed by the Choustikov et al. (2023), based on the SPHINX simulations which uses six observed galaxies' properties to infer the angle-averaged (and not sight-line dependent) f esc .We see that non-Lyα emitters and Lyα emitters at the same redshift and in the same UV magnitude range do not show significant differences in the predicted f esc , although determined using two different and independent methods.This might be a first indication that in the EoR, when the visibility of Lyα emission is increasingly suppressed by neutral IGM, the Lyα line emerging from the galaxies is not a good indicator of the LyC photons' escape and therefore other indirect indicators are needed.Further investigation of this important issue is in progress and will be presented in follow up paper.

Extreme LyC emitters
We analyzed in more details the 16 sources from our final sample that show an f esc higher than 0.2.The majority of them show an intense O32 or high EW(Hβ) coupled with small r e or very blue β slope.We highlight the fact that an extremely blue β is a very good predictor of a high f esc : 10 out of 17 sources with β > −2.5 have a predicted f esc > 0.2.Indeed Chisholm et al. (2022) identified the β slope as one of the best indirect indicators.However this condition does not seem necessary, since there are several sources that have more average β slopes (i.e., of the order of −2) but for which we predict high f esc because they are both extremely compact and have a high O32 or high EW(Hβ).Similarly of the 8 unresolved sources for which we are able to infer f esc , 5 are extreme leakers ( f esc > 0.2).However we have some leakers with r e larger than 0.5 kpc.Overall, there is not one single property that stands out as more important.This reinforces the idea that more than one indicator is needed to correctly identify the entire population of LyC emitters.

Ionizing photon production efficiency
Direct constraints on ξ ion can be obtained from the measurement of Balmer emission lines luminosity after correcting for dust attenuation (e.g.Schaerer et al. 2016;Shivaei et al. 2018) or from modelling the contribution of these optical emission lines to the broad band measurements when spectroscopic observations are not available (e.g.Bouwens et al. 2016).Unfortunately, the Hα line is outside the observed range of most of our galaxies (see Sec. 3.2) and the Hβ line is also missing from some sources: in addition, there are still some calibration uncertainties on NIRSpec absolute flux (and therefore luminosity).Chevallard et al. (2013) showed that log ξ ion can be measured by using EW([O iii]λλ4959, 5007) (see also Tang et al. 2019b).The [Oiii] lines are clearly detected for all sources, and in addition the EW measurements have less calibration uncertainty compared to the flux.We calculated the log ξ ion values from EW([Oiii]) following the Eq. 3 from Chevallard et al. (2018).We obtain an average ⟨log ξ ion ⟩ = 25.27 ± 0.51, which is consistent with predictions from physical models (Yung et al. 2020b;Wilkins et al. 2016) and slightly lower than other measurements at the EoR.For example, Saxena et al. (2023); Simmonds et al. (2023) estimated log ξ ion from Hα luminosity, finding respectively average values of 25.56 and 25.44 although their samples included Lyα emitters whose photon production efficiency is generally higher, while Castellano et al. (2022);Prieto-Lyon et al. (2022); Endsley et al. (2023), using SED fitting, obtain an average value of log ξ ion of 25.14, 25.33, 25.7 respectively.In Fig. 9 we show the distribution of ξ ion for our sample.We do not find any correlation with the β slope: our best fit is consistent with the average value also shown in the Figure .At variance with this, Prieto-Lyon et al. ( 2022) find a slight dependence on this property for galaxies at z = 3−7, in the sense that bluer star-forming sources tend to have higher photon production efficiencies (see also Castellano et al. 2023).We also do not find any dependence of ξ ion on M 1500 in accordance to what found by Prieto-Lyon et al. (2022); Endsley et al. (2023).Note that the recent results by Atek et al. (2023) indicate a higher ξ ion for much fainter galaxies (M UV ≥ −17) during the EoR.

The ionizing photon production of bright and faint sources
Having derived predictions for f esc and ξ ion for our large sample of galaxies in the EoR, our goal is now to solve Eq. ( 1) and determine the relative contribution of galaxies as a function of M 1500 , to establish which sources contributed most to the total ionizing photon production rate at these epochs.We consider: 1) ρ UV from the Luminosity Function (LF) of Bouwens et al. (2021) at our median redshift ⟨z⟩ ∼ 7.2.The best-fit α slope that characterizes the faint-end of the UV-LF is −2.06 ± 0.03; 2) f esc as a function of M 1500 from the values derived in Fig. 7 between M 1500 -22 and -18 (i.e. the range covered by our observations); we use a fixed value of 0.10 at fainter magnitudes, where we have only few sources, and a value of 0.05 at magnitudes brighter than -22, where we do not have any observed source in our sample; 3) ⟨log ξ ion ⟩ = 25.27, which does not vary with M 1500 , as found in Sec.4.5.We assume a low luminosity cut at M 1500 = −13 and a high luminosity cut of M 1500 = −23 (as in Robertson et al.

2015).
To estimate the total ṅion we proceeded as follows: we first discretized the M 1500 range over [-13,-23] in bins of width 1 mag.
For each of these intervals we calculated ρ UV in the considered magnitude bin and multiplied it by the appropriate ξ ion and f esc .
We then derive the fraction of the total ṅion that is provided by galaxies in each magnitude bin: the results are shown in Fig. 10 and indicate that the galaxies that we can currently characterise with JWST observations are contributing to only a fraction of the total ionizing budget, i.e. less than 35% of the total.We would therefore need to push our observations at 2-3 magnitudes deeper to characterise the bulk of the cosmic ionizers.Note however than in previous studies (e.g., Finkelstein et al. 2019;Robertson et al. 2015) the fraction of ionizing photons from faint galaxies was even more prominent, with the extreme faint end of the luminosity function dominating the ionizing emissivity (e.g., see Figure 8 of Finkelstein et al. 2019).We also show how the results would vary by changing the faint end f esc to a values of 0.05 and 0.15 respectively (right top panels of Fig. 10) and by changing the faint end slope of the UV-ρ UV within the 5σ uncertainties [−1.94, −2.27].We see that the contribution of JWST sources with M 1500 < −18 to the integrated ionizing emissivity becomes 40% if we assume a very small f esc for the faintest galaxies, or an extremely flat LF UV at the faint end, but it is never dominant even in these extreme cases.

Summary and conclusions
In this paper, we have presented an analysis of 70 spectroscopically confirmed star-forming galaxies at 6 ≤ z ≤ 9 from the CEERS survey, combined with 12 sources with public data from other JWST early campaigns.Assuming that the mechanisms that facilitate the escape of LyC photons from galaxies remain consistent across all cosmic epochs, we estimated the f esc of the observed sources employing two empirical relations based on the most promising indirect indicators of this emission identified at z ∼ 0.3, which are also measurable during the EoR.Using the mean inferred f esc as function of M UV and the photon production efficiency derived from the [Oiii] emission line, we have then evaluated the relative role of faint and galaxies and their contribution to the process of reionization.Our main results are the following: -The majority of our sources show modest f esc values, with a mean f esc of 0.13 ± 0.02, and an even lower median of = -2.27Fig. 10: Left: The ṅion fraction of galaxies at 6 ≤ z ≤ 9 as function of the M 1500 .Triangles represent the ṅion obtained from the predicted f esc as a function of M 1500 in each bin.Bars represent the extrapolated value, i.e. the ṅion fraction derived assuming that for M 1500 > −18 a f esc value constant and equal to 0.10, and for M 1500 < −22 a f esc value constant and equal to 0.05.Right: in yellow we show the ṅion fraction of galaxies at 6 ≤ z ≤ 9 as function of M 1500 changing, respectively, from the top left to the bottom right, the f esc values at the faint-end (0.05 and 0.15) and the α parameter of the ρ UV (−1.94 and −2.27).The original result is also shown with the same symbols as in the figure from the left.0.08 ± 0.02.Just 20% of galaxies have f esc > 0.2: the majority of these extreme LyC emitters show an intense O32 or high EW(Hβ) coupled with small r e or very blue β slope.As expected there is no single property that stands out as the best indirect indicator of a high LyC escape.
-The predicted f esc has a modest dependence on the total stellar mass M ⋆ with low mass galaxies tending to have higher mean f esc although the trend is scattered.The relation with M 1500 is less well characterised and there is not a significant dependence.-There is a strong discrepancy between our inferred f esc and those predicted by most cosmological hydrodynamical simulations of reionization, which consistently infer much lower f esc values for galaxies in the same mass range as the one explored by the JWST observations.We discuss potential causes for the discrepancy such as the failure of simulations to fully account for the bursty nature of star formation, or the limited resolution.Alternatively using relations derived from low-redshift samples to infer f esc for galaxies in the EoR might not be correct and could lead to an overestimate of the f esc values.-The average predicted f esc have at most a modest increase with redshift from z = 5 to z = 8 raising from 0.11 to 0.14.-The predicted f esc during the EoR does not show a clear dependence on the presence of Lyα emission.This is actually expected since in the EoR the Lyα emission is modulated also by the IGM opacity in the local surroundings and not just by the galaxies properties as at low redshift; -With the inferred values of f esc and ξ ion we derive a total ionizing emissivity log ṅion = 50.50± 0.38 and 50.75 ± 0.35 s −1 Mpc −3 at redshift 8 and 6 respectively, i.e. comparable to the threshold needed to maintain the Universe ionized.Sources brighter than M 1500 = −18, which are those we can currently characterise with JWST observations, only contribute less than 35% to the total ionizing emissivity.
The findings of this study provide crucial insights into the reionization epoch, primarily focusing on the characterization of relatively bright sources and indicating that galaxies significantly fainter and less massive than those observed by the initial JWST programs, could potentially play a dominant role in the reionization process.To study significantly large samples of such faint galaxies, ultra-deep observations of galaxy cluster fields will be needed since lensing becomes a necessary tool, as in the recent work by Atek et al. (2023) which reaches galaxies as faint as M UV = −15.
In addition further work on the LyC indirect indicators will be needed to validate the fundamental assumption that the physical mechanisms and conditions that facilitate the escape of Lyman continuum photons remain the same over cosmic time.In particular, future work should be aimed at assembling a solid reference sample of Lyman continuum emitting galaxies, analogous in size to the LzLCS survey, but at z = 3 − 4, i.e. the highest redshift where a direct detection of LyC photons is possible and which is much closer in time to the epoch of reionization.If our derived relations to infer f esc were still valid at z=3-4, then we could be much more confident that they can be also applied in the EoR.

Fig. 1 :
Fig.1: M 1500 distribution for the analyzed sources at 6 ≤ z ≤ 9 (grey: total sample; red and blue are respectively the GLASS sample and the ERO sample fromMascia et al. 2023;Noirot et al. 2023).
Fig. 2: β vs. M 1500 .Black triangles: CEERS sources with β slope obtained fitting 3 or 2 photometric bands.Red dots: GLASS sample; blue squares: ERO sample.The green line shows the relation at z ∼ 7 derived from HST data by Bouwens et al. (2014).Dashed portions indicate the extrapolation of the relation in our range of M 1500 .

Fig. 3 :
Fig. 3: Rest-frame UV r e vs. M 1500 .Symbols are the same as in Fig. 2. The green line shows the relation derived from HST data by Shibuya et al. (2015) at z ∼ 7, the blue line shows the size-luminosity relation at z ∼ 6 − 7 from Yang et al. (2022a).

Fig. 4 :Fig. 5 :
Fig.4: PDFs of the f esc values for two sources from the CEERS sample (MSA IDs: 390 and 80925).The considered f esc of the source, which is the mean f esc of the distribution, is shown in red.

Fig. 6 :
Fig.6: Left: predicted f esc vs stellar mass (log 10 M ⋆ ).Symbols are the same as in Fig.2.The green line shows the running average for our sample, while the red one is the prediction at z = 6 − 8 fromRosdahl et al. (2022).Right: predicted f esc vs halo mass (log 10 M h ) estimated withBehroozi et al. (2019) conversion.The blue and orange lines are the models fromOcvirk et al. (2021), the violet one is the prediction fromLewis et al. (2023), the line ones are the predictions fromBremer & Dayal (2023).
of their paper at a median M 1500 = −19.As previously discussed, their f esc values are generally lower than ours, but they predict a slow in-Article number, page 8 of 14 S. Mascia et al.: New insight on the nature of cosmic reionizers from the CEERS survey

<
Fig.8: Predicted f esc as a function of redshift.Symbols are the same as in Fig.2.The magenta points are the average f esc values of our sample at z = 6.5 and z = 8.The green dots are the M23 sources, while the green point is the average f esc value at z = 5.1.The blue hexagons represent the Lyα emitters fromSaxena et al. (2023).The red line is the prediction at fixed M 1500 = −19 fromRosdahl et al. (2022).

Fig. 9 :
Fig.9: log ξ ion vs β.The magenta line shows the mean log ξ ion for our sample.

Table 1 :
Physical and spectroscopic properties for the CEERS sample.