The prevalence of galaxy overdensities around UV-luminous Lyman emitters in the Epoch of Reionization

Before the end of the Epoch of Reionization, the Hydrogen in the Universe was predominantly neutral. This leads to a strong attenuation of Ly α lines of z (cid:2) 6 galaxies in the intergalactic medium. Nevertheless, Ly α has been detected up to very high redshifts ( z ∼ 9) for several especially UV luminous galaxies. Here, we test to what extent the galaxy’s local environment might impact the Ly α transmission of such sources. We present an analysis of dedicated Hubble Space Telescope ( HST ) imaging in the CANDELS/EGS ﬁeld to search for fainter neighbours around three of the most UV luminous and most distant spectroscopically conﬁrmed Ly α emitters: EGS-zs8-1, EGS-zs8-2, and EGSY-z8p7 at z spec = 7.73, 7.48, and 8.68, respectively. We combine the multiwavelength HST imaging with Spitzer data to reliably select z ∼ 7–9 galaxies around the central, UV-luminous sources. In all cases, we ﬁnd a clear enhancement of neighbouring galaxies compared to the expected number in a blank ﬁeld (by a factor ∼ 3–9 × ). Our analysis thus reveals ubiquitous o v erdensities around luminous Ly α emitting sources in the heart of the cosmic reionization epoch. We show that our results are in excellent agreement with expectations from the D RAGONS simulation, conﬁrming the theoretical prediction that the ﬁrst ionized bubbles preferentially formed in o v erdense re gions. While three UV luminous galaxies already have spectroscopic redshifts, the majority of the remaining fainter, surrounding sources are yet to be conﬁrmed via spectroscopy. JWST follo w-up observ ations of the neighbouring galaxies identiﬁed here will thus be needed to conﬁrm their physical association and to map out the ionized regions produced by these sources.


INTRODUCTION
The Epoch of Reionization (EoR) remains one of the key frontiers of extragalactic studies.Understanding exactly when and how reioniza-tion occurred within the first Gyr of cosmic history is a major goal of astronomy over the next decade.While great progress has been made to constrain the overall time frame of reionization through different measurements, including the polarization of the Cosmic Microwave Background (CMB; Planck Collaboration et al. 2020), the onset and duration of the EoR are still uncertain (e.g., Greig & Mesinger 2017;Naidu et al. 2020;Bosman et al. 2021; also see Robertson 2021 for a recent review).
An efficient tool used to further constrain the evolution of the neutral fraction at  > 6 is the Ly line emitted by galaxies (socalled Ly Emitters; LAEs).Due to its resonant nature, Ly is easily scattered and absorbed in the neutral hydrogen in the IGM making LAEs a sensitive probe of the progress of cosmic reionization (e.g., Miralda-Escudé 1998;McQuinn et al. 2007;Dayal et al. 2011;Mason et al. 2018).Indeed, the fraction of continuum selected galaxies that do show Ly has been shown to decrease rapidly when entering the neutral phase of the Universe at  > 6 (e.g., Stark et al. 2010;Fontana et al. 2010;Schenker et al. 2012;Treu et al. 2013;Pentericci et al. 2014;Hoag et al. 2019; also see Ouchi et al. 2020 for a recent review).
In stark contrast to this overall decrease of LAE fractions in the EoR are the successful detections and spectroscopic confirmations of particularly luminous, continuum-selected galaxies at  > 7. From a UV-luminous sample with  UV ∼ −22 mag identified in the CAN-DELS fields (Roberts-Borsani et al. 2016), all four galaxies were subsequently confirmed spectroscopically through their Ly line (Oesch et al. 2015;Zitrin et al. 2015;Roberts-Borsani et al. 2016;Stark et al. 2017;Pentericci et al. 2018), and in some cases also through other UV emission lines.These galaxies thus first hinted at a trend of larger Ly fractions in more luminous galaxies at  > 7 compared to fainter sources (see also Ono 2012;Mason et al. 2018).This has subsequently been confirmed with larger statistics (e.g., Jung et al. 2020;Endsley et al. 2021b;Laporte et al. 2021), albeit not all luminous sources at  > 7 revealed detectable Ly.
Several possible explanations for larger detection rates of Ly in luminous sources have been discussed in the literature.In particular, the sample from Roberts-Borsani et al. (2016) was initially selected through an IRAC excess, which originates from extremely strong [OIII]+H emission lines, with rest-frame equivalent widths in excess of 800 Å.Such strong lines seem to be typical at these redshifts (e.g., Labbé et al. 2013;Smit et al. 2015;De Barros et al. 2019;Endsley et al. 2021a).The strong radiation field required to power such extreme lines might also impact the local environment around these galaxies, increasing the transmission of Ly (Stark et al. 2017;Endsley et al. 2021b).An additional contribution of ionizing photons could come from an active galactic nucleus (AGN), which might be present in some luminous sources (Tilvi et al. 2016;Laporte et al. 2017;Mainali et al. 2018;Endsley et al. 2021b).Additionally, luminous sources have been observed to show significant velocity offsets between their Ly lines and the systemic redshifts, likely due to strong outflows (e.g., Erb et al. 2014;Willott et al. 2015;Stark et al. 2017;Hashimoto et al. 2019;Matthee et al. 2020).This would shift the red wing of the Ly lines further away from the IGM damping wing, further increasing transmission.However, it is still not clear whether this effect is large enough to explain the observed luminosity-dependent evolution of the Ly fraction over  = 6 − 8 (e.g., Mason et al. 2018).
Another possible explanation for a very high Ly fraction in luminous sources could be that they are embedded in the most overdense regions.These overdense volumes are expected to reionize first thanks to the combined emission from both the luminous galaxies and the numerous fainter sources around them (e.g., Barkana & Loeb 2004;Dayal et al. 2009;Pentericci et al. 2011;Endsley et al. 2021b).Indeed, the bright galaxies themselves are most likely unable to emit enough ionizing photons to produce an ionized bubble large enough to allow Ly photons to escape (Wyithe & Loeb 2005).
Observationally, several galaxy overdensities have now been identified during the EoR both for continuum and narrow-band selected galaxies (e.g., Trenti et al. 2012;Ishigaki et al. 2016;Hu et al. 2021).In several cases, there is strong evidence for a connection between galaxy overdensities and overlapping ionized bubbles that allow Ly to be transmitted (e.g., Castellano et al. 2016;Tilvi et al. 2020;Endsley et al. 2021b).In this scenario, reionization starts in the first overdense regions, implying large spatial fluctuations in the ionized fraction and an inhomogeneous topology.This is indeed what is found in cosmological simulations (e.g., Iliev et al. 2006;Hutter et al. 2017;Geil et al. 2017;Kulkarni et al. 2019;Gronke et al. 2020;Ocvirk et al. 2020;Qin et al. 2021;Smith et al. 2021).
In this paper, we extend the search for overdensities and possible ionized bubbles to very high redshifts.In particular, we obtained dedicated HST imaging around three of the most distant and UV-luminous Ly-detected galaxies, in order to search for fainter neighbouring galaxies.In total, we study the environment of four galaxies in the EGS field: three confirmed Ly emitting sources at  = 7.5 − 8.7, in addition to a photometrically selected galaxy at  ∼ 9 (see Sect. 2).
The paper is organized as follows: in Section 2, we present the HST and ancillary datasets used.Section 3 focuses on the selection of our galaxy sample, before we quantify the environment around the UV luminous targets in Section 4 and compare our results with simulations in Section 5. Finally, we discuss implications for the Lyman-alpha visibility of luminous reionization-era galaxies in Section 6.

Ancillary Data
The EGS field is one of the five extragalactic fields observed with the Cosmic Assembly Near-infrared Deep Extragalactic Legacy Survey (CANDELS) program (Koekemoer et al. 2011;Grogin et al. 2011).
It is centred at (RA, DEC)=(214.825000,+ 52.825000) and was originally covered with 2-orbit exposures of Advanced Camera for Surveys (ACS)  606 ,  814 and Wide Field Camera 3 (WFC3/IR)  125 and  160 imaging (0.67 orbits in F125W, and 1.33 in F160W).The 3D-HST survey covers the field with  140 (F140W) imaging (Skelton et al. 2014;Momcheva et al. 2016), and a few individual follow-up programs have added targeted  105 (F105W) images for a few sources (e.g., Bouwens et al. 2016).Additionally, the field has extensive multi-wavelength data from ground-based facilities and was covered with Spitzer's Infrared Array Camera (IRAC) as part of the S-CANDELS program (Ashby et al. 2015).The total observed area of the EGS field is ∼200 arcmin 2 .For more information on the ancillary multi-wavelength data in this field see, e.g., Skelton et al. (2014) and Stefanon et al. (2017).

Dedicated Follow-up of Luminous High-Redshift Sources
The EGS field is especially rich in early galaxies (e.g., Bouwens et al. 2015).It has revealed several particularly luminous high-redshift candidates at  > 7 (e.g.Bouwens et al. 2015Bouwens et al. , 2019;;Roberts-Borsani et al. 2016;Finkelstein et al. 2021), three of which have subsequently The darker grey areas show fields where F105W imaging is available and that have been included in our search.The three pointings from our own program were centred on EGS-zs8-1 (orange outline), EGS-zs8-2 (yellow) and EGSY-z8p7 (pink).Also highlighted (in green) is the F105W outline around EGS910-3 presented in Bouwens et al. (2016) that we include in our analysis.Black squares correspond to the location of Ly emitters.The colourbars indicate the photometric redshifts of the sources.The sizes of the circles represent their  160 magnitude.Also indicated is the 3D distance of 9 physical Mpc between the two Ly galaxies at  ∼ 7.6 on the left.been confirmed through Ly emission lines.These are EGS-zs8-1 at  spec = 7.7302 ± 0.0006 (Oesch et al. 2015; see also Stark et al. 2017;Tilvi et al. 2020), EGS-zs8-2 at  spec = 7.4770 ± 0.0008 (Roberts-Borsani et al. 2016;Stark et al. 2017), and EGSY-z8p7 at  spec = 8.683 +0.001 −0.004 (Zitrin et al. 2015; see also Mainali et al. 2018).All of these sources are expected to exhibit strong radiation fields due to high-equivalent width [OIII]+H emission lines identified through their Spitzer/IRAC photometry (Roberts-Borsani et al. 2016).Indeed, highly excited UV lines have been confirmed in the MOSFIRE spectra of EGS-zs8-1 (C ; Stark et al. 2017) and EGSY-z8p7 (N ; Mainali et al. 2018).Especially the detection of the N line in EGSY-z8p7 points to a non-negligible, possible contribution from an AGN.
While other particularly distant EoR galaxies have been spectroscopically confirmed since the discovery of these sources in the EGS field (e.g.Oesch et al. 2016;Jiang et al. 2021;Hashimoto et al. 2018;Jung et al. 2020;Laporte et al. 2021), they still rank among the most distant detected Ly emission lines (see also Larson et al., in prep;Larson et al. 2020).For more information on the luminous Ly sources, see Section 3.1 and Tables 1 and 2.
To explore whether these very early Lyman-alpha emitters commonly reside in overdense regions, we have obtained dedicated, additional multi-colour HST WFC3/IR observations around these three luminous galaxies (GO-15103;De Barros 2017).In particular, we added the missing F105W filter that is necessary to reliably identify  ∼ 7.5 − 9 galaxies and measure improved redshifts, in addition to deeper F125W and F160W imaging.The program was designed to identify galaxies that are up to an order of magnitude less luminous than the main targets, which have   = 25.0 − 25.3 mag.In particular, we obtained 5-orbit exposures in three individual pointings centred on each of the three luminous LAEs (2 orbits in F105W, 1.5 in F125W, and 1.5 in F160W).When combined with the previous, CANDELS HST imaging our data reach 5 limits of 27.9 mag in F105W, 27.5 mag in F125W, and 27.5 mag in F160W, respectively.
In addition to our dedicated HST data, we retrieved all the HST ACS and WFC3/IR data available around the EGS sources from the HST archive and combined them into one mosaic as part of the Hubble Legacy Field project (Illingworth et al, in prep).

Source Detection and Multi-Wavelength Photometry
The HST images in different bands were convolved to the same point-spread function (PSF; using convolution kernels analogous to the ones made available by the 3D-HST team; Skelton et al. 2014).We then use SExtractor (Bertin & Arnouts 1996) to detect sources in a combined F125W+F160W image and measure their colours through matched, circular apertures.In particular, we use apertures with 0. 5 diameter to measure colours, which are then corrected to total fluxes using the SExtractor AUTO fluxes in the F160W image.
In order to obtain IRAC photometry in all four channels, we follow the same procedure as described in Stefanon et al. (2021a).To overcome confusion, we subtract contaminating flux of neighbouring galaxies using the Mophongo tool (see, e.g., Labbé et al. 2013Labbé et al. , 2015)), before measuring the IRAC fluxes in 1. 8 diameter apertures and correcting to total fluxes.When the contamination by neighbour is too large for a reliable subtraction, the IRAC fluxes are flagged and ignored in the subsequent spectral energy distribution fits.We use the same procedure to derive photometry from ground-based K-band images.Overall, we thus obtain 11-band spectral energy distributions (SED) for all sources (6× HST, K-band, and 4× IRAC), spanning 0.6 to 8 m.

Central LAE Targets
Our HST follow-up observations are centred around three UV luminous galaxies at  ≥ 7.5 that have been confirmed through Ly in the EGS field (see Section 2.2).One of these, EGS-zs8-1, has subsequently been observed through narrow-band observations, from which two fainter LAEs have been identified in its immediate proximity (Tilvi et al. 2020).These latter sources have also been confirmed through Ly spectroscopy and they were named z8_SM at  = 7.767 and z8_4 at  = 7.748.They likely sit in a common, ionized bubble with EGS-zs8-1, since they are only separated by 0.7 to 0.8 physical Mpc (pMpc) from each other and their estimated ionized bubble radii reach 0.5 − 1.0 pMpc.
Additionally, the galaxy group around EGS-zs8-1 is separated by only 9.1 pMpc from the other UV-luminous, spectroscopically confirmed source EGS-zs8-2 at z=7.48.This distance is smaller than the average bubble radius of 10 pMpc expected at the end of cosmic reionization (Wyithe & Loeb 2005;Geil et al. 2017), and points at a larger scale overdensity in this field at this redshift.One of the main goals of our follow-up observations was to further characterize this overdensity through fainter sources.Hence, we centred two of our HST pointings around EGS-zs8-1 and EGS-zs8-2, respectively (orange and yellow outlines in Fig 1).
The last pointing was centred around EGSY-z8p7, one of the most distant known Ly emitters with  spec = 8.683 (pink outline in Fig 1).This source has a potential companion at < 10 pMpc: the galaxy EGS910-3 (Bouwens et al. 2016, see also Bouwens et al. 2019).On the sky, these two sources are separated by only 3.7 arcmin.However, the source only has a photometric redshift measurement ( phot = 9.0 +0.5 −0.7 as derived in Bouwens et al. 2016) and has not been confirmed through spectroscopy so far.Hence, the actual physical separation from EGSY-z8p7 is uncertain.The source EGS910-3 has also been targeted by F105W follow-up observations, which we retrieve from the archive and include in our analysis (see, e.g., Fig A1).

Selection Criteria
The main goal of our program was to identify fainter sources around the central UV-luminous targets.We thus use the HST-detected multiwavelength catalogs described in Section 2.3 to select candidate highredshift sources.Specifically, we use a combination of detection and non-detection criteria as expected for Lyman break galaxies together with a photometric redshift selection.We only consider sources that are significantly detected in both the F125W and F160W bands with the following criteria: where (/) det corresponds to the signal-to-noise in the combined F125W and F160W image.These criteria allow us to identify sources down to  160 = 27.7 mag.Additionally, the sources were required to not be detected in the two optical HST filters with: / ( 606 ) < 2 ∧ / ( 814 ) < 2. In selecting these sources, we have only included areas that also have F105W imaging coverage, either from our own follow-up program or from archival observations (darker areas in Fig 1).Photometric redshifts are then derived for all the sources that pass the above criteria.We use the python version of the publicly available code EAZY1 (Brammer et al. 2008) which provides the photometric redshift probability distribution function (pdf), P (), based on a  2 fit of SED models to the observed photometry.Specifically, we use the template set 'eazy_v1.1'which includes emission lines.We have also tested other template sets, but have not found large differences.EAZY also accounts for the intergalactic absorption by neutral hydrogen (Madau 1995).The redshifts have been searched over 0.1 ≤  ≤ 12.
Based on the EAZY output we then only include sources that have a best-fit photometric redshift in our range of interest, depending on the central target sources.For the fields around EGS-zs8-1 and -2, we select galaxies with  phot = 7 − 8. To ensure that these redshifts are reasonably accurate, we exclude galaxies with a photometric redshift pdf that is too wide, i.e., galaxies are required to have P (6 ≤  ≤ 9) ≥ 50%.For the fields around EGSY-z8p7, we select sources with  phot > 8 that have P (7 ≤  ≤ 10) ≥ 50%.In Figure A1, we illustrate the SEDs and redshift pdfs of the central UV-luminous Ly emitters and EGS910-3.As can be seen, the photometric redshift determinations of the Ly sources are in excellent agreement with their spectroscopic redshifts, with a very narrow P ().For the last source, we confirm the photometric redshift solution from Bouwens et al. (2016), as we measure a consistent value of  phot = 9.18 +0.95  −0.55 .

Exclusion of Dwarf Stars
Cool dwarf stars can mimic the colours of high-redshift Lyman break galaxies (e.g.Wilkins et al. 2014).They can remain undetected in optical bands (in F606W and F814W filters) and peak in the Nearinfrared (NIR).To remove such sources from our catalogs of LBG candidates we use a combination of two different measurements.First, we identify potential unresolved sources using the SExtractor parameter CLASS_STAR, and second we run EAZY with a set of dwarf star templates fixed at  = 0 to compare the  2 from both dwarf stars and galaxy templates.The CLASS_STAR parameter on its own is not reliable, especially at faint magnitudes.Objects with CLASS_STAR more than 0.6 and  2 from star templates less than  2 from galaxy template are removed from the catalogs: In total, only two sources have been removed based on these criteria.Visual inspection of these sources indeed confirmed them to be unresolved, and hence more likely to be stars than high-redshift galaxies.

Final Sample
After excluding two likely dwarf stars, our final sample of highredshift galaxies within the four WFC3/IR pointings around the four UV-luminous central targets consists of 21 sources in total.Their positions are shown in Fig. 1.In the following we will discuss the individual environment around the four central sources.
With 8 sources, the pointing around EGS-zs8-1 revealed the largest number of high-redshift galaxy candidates at  = 7 − 8 (including the central source itself).The EGS-zs8-2 field resulted in 5 sources.Note that the spectroscopically confirmed source z8_4 is not included in our photometric sample due to a photometric redshift that was too low.For completeness, we show the galaxy in Fig. 1.
At higher redshifts, the fields around EGSY-z8p7 and EGS910-3 reveal 4 sources with  phot > 8 each.All these sources are listed in Tables 1 and 2. The photometric redshift distribution functions and stamps for the luminous sources can be found in the appendix in Fig A1 , while the fainter, neighbouring sources are shown in Fig. A2 to  A5. † This source has a close companion, which was excluded from our list due to its photometric redshift just below the cutoff of our selection.

THE ENVIRONMENT OF UV LUMINOUS GALAXIES
A principle goal of our work is to quantify the environment and possible overdensities of UV luminous sources with Ly emission at  > 7. To do this, we first need to estimate the expected number of galaxies in a given area, which we do based on two different approaches that are described in the following subsections.

Based on the UV Luminosity Function
The first approach to estimate how many galaxies we expect to find within a given area is based on the UV luminosity function (LF) and the galaxy selection efficiency.Following previous literature (e.g.Castellano et al. 2016), we estimate the selection efficiency of our survey by simulating the photometry for a large number of high-redshift galaxies.Specifically, we introduce > 60, 000 artificial galaxies with a range of known (input) magnitudes 24 − 29 mag, colour distribution, and redshifts 6 <  < 10.Since high-redshift galaxy selections can be sensitive to the UV The left panels show the cumulative numbers as a function of absolute magnitude.The blue line corresponds to the expected number based on the UV LF and our selection function, while the coloured lines correspond to the actual numbers of galaxies that we have identified in the HST data in the different fields (orange for EGS-zs8-1; yellow for EGS-zs8-2; pink for EGSY-8p7; green for EGS910-3).The shaded regions correspond to the 16th to 84th percentile uncertainties for a Poisson distribution convolved with cosmic variance (see text for details).The right panels show the total number of sources selected down to our magnitude limits using the same colours as for the cumulative histograms,with the points offset horizontally for clarity.Additionally, the purple histograms show the distribution of high-redshift galaxies selected in 82 independent 4.5 arcmin 2 cells in the GOODS-S and GOODS-N fields, using the same selection criteria as for our HST data in the EGS field (see section 4.1.2).The dashed black line corresponds to the mean in the GOODS fields, which is consistent with the mean expected number from the UV LF (blue).As is evident from these figures, the environment around all four UV luminous sources are overdense relative to the general field.The enhancement ranges from 3× to 9×. colours, we simulate the artificial sample with a colour distribution that matches observations.To do this, we start with a Bruzual & Charlot (2003) template with an age of 100 Myr and a constant starformation history.Calzetti et al. (2000) dust is then applied following a normal distribution of E(B-V) with a mean of 0.1 and a  = 0.15, truncated at zero (following Finkelstein et al. 2021).Nebular continuum and emission lines have been added empirically based on the total number of ionizing photons produced by the underlying SED and using line ratios from Anders & Fritze-v. Alvensleben (2003).Finally, IGM attenuation is applied according to the prescription of Madau (1995) and set by the simulated redshift of each galaxy.For  > 7, this essentially corresponds to a sharp break at the redshifted Ly line.The final SEDs are then convolved with all the HST and Spitzer IRAC transmission curves to derive the expected, intrinsic fluxes.These are subsequently disturbed using Gaussian flux uncertainties, based on the median flux  as measured for real sources in our HST footprints in the respective filters.
We then perform the same analysis as for the real catalog.First, the detection and optical non-detection criteria are applied and the EAZY photometric redshifts and pdfs are computed as was done for the real galaxies.This allows us to compute the completeness in bins of input magnitude and redshift as the ratio between the number of galaxies that pass our selection and the number of input galaxies in the respective bin: Now that we have the selection efficiency, it is straightforward to compute the expected number of galaxy detections based on the empirical UV LF.To do this, we use the Schechter parameter evolution derived in Bouwens et al. (2021) at  ∼ 2 − 10: The expected number of detections can then be derived by multiplying the UV LF with the probability that a galaxy of absolute magnitude  at redshift  is selected, P (, ), within the same area as used for real sources.
where () is the UV LF, Ω is the survey area for each HST field, i.e., 4.5 arcmin 2 for a WFC3/IR pointing.
For the  ∼ 7 − 8 galaxy samples around EGS-zs8-1 and -2, we derive an expected number of 1.57 galaxies per WFC3/IR pointing down to our detection limit of 27.7 mag (see Fig. 2).For the  ∼ 8 − 9 samples, the expected number is only 0.45 sources per WFC3/IR field.

Empirical Approach
Another way to determine whether a given HST field might be overdense can be derived empirically using the number counts from another field that is as deep or deeper than the field in question.We can thus use the CANDELS/GOODS-North and CANDELS/GOODS-South fields with an area of ∼170 arcmin 2 each, from which high redshift galaxy sources have been selected in the past.Specifically, we use the sample from Bouwens et al. (2015), for which multiwavelength catalogs are available from GREATS (Stefanon et al. 2021b).For consistency, we re-derive the photometric redshifts and pdfs for this sample using the same EAZY setup as for our main selection.We then select the groups of sources with best-fit redshifts at 7 ≤  ≤ 8 and 8 ≤  ≤ 9. From these, we limit the sources at  160 = 27.7 mag, which corresponds to the 5 detection limit in our EGS data around the Ly emitters.
To get a histogram of expected sources in an area equivalent to a single WFC3/IR pointing, we divide the GOODS-S and GOODS-N fields into circular cells with radius 1.2 arcmin, corresponding to an area of 4.5 arcmin 2 each, and count the number of the selected The purple curve shows the LFs for the  = 7 − 8 galaxies, while the cyan line corresponds to  = 8 − 9.This analysis shows that these UV luminous sources lie in areas with a general density enhancement of 3 − 9× that extends to fainter galaxies.
sources within each cell.The positioning of the cells was chosen on a grid allowing for a small (< 10%) overlap between different areas.In total, the GOODS-S field was thus split into 42 independent regions, while the GOODS-N field was divided into 40 separate cells.
The number count histograms of these 82 regions are also shown in Fig. 2. The average number of sources per cell is 1.2 at  ∼ 7 − 8 and 0.45 at  ∼ 8 − 9, which is consistent with the average expected number from the UV LF derived in the previous section.However, the empirical histograms further allow us to visualize the effect of cosmic variance.For example, out of 82 independent cells in both GOODS fields, none contain 8 sources, while only 2 contain ≥5  ∼ 7 − 8 galaxies.Empirically, finding ≥5 galaxies is thus expected only with a 2.4% probability.
At higher redshift, the largest number of galaxies in a given 4.5 arcmin2 region is 3. Based on this, our finding of 4 (5-8) sources around the UV luminous LAEs at  > 8 ( = 7 − 8) clearly indicates that these galaxies live in early overdensities.We quantify this in more detail in the next section.

Prevalent Overdensities Around UV Luminous Galaxies
We now ready evaluate the environment of the HST fields around the UV luminous sources.We start with the two Ly sources at  ∼ 7.5, around which we find 5 and 8 sources in 4.5 arcmin 2 .This is 3× and 5× overdense relative to the expected number of 1.57 sources based on the UV LF.
In order to compute more accurate probabilities to find 5 or 8 sources when 1.57 are expected we specifically account for cosmic variance.To do this, we use the publicly available cosmic variance calculator galcv 2 of Trapp & Furlanetto (2020).For a 4.5 arcmin 2 field and a depth of Δ = 1, this predicts a relative   of 32% at  = 7.5 and 38% at  = 8.5, down to our magnitude limits.These numbers are similar (albeit somewhat smaller) than the predicted cosmic variance from Trenti & Stiavelli (2008).
The full probabilities are then derived from a Monte Carlo simulation in which we first perturb the expected numbers of galaxies from the UV LF by the relative cosmic variance and then draw from the Poisson distribution.Doing this, we find that the probability to find 8 sources, when 1.57 are expected is indeed small with 0.08%.This corresponds to a 3.2 result.Similarly, the probability to find ≥ 5 galaxies is only 3.1%, consistent with our empirical estimate above.This still represents a significance of 1.9.
For the higher redshift fields, we find 4 sources in each pointing, while only 0.46 were expected on average.This corresponds to an enhancement of 8.7×.Doing the same calculation as above, we find that the probability of finding 4 (or more) sources is only 0.2%, corresponding to a 2.9 result for each field separately.These calculations show that overdensities around UV luminous sources have been significantly detected in all four cases, with overdensity values ranging between 3 − 9×.
This excess can also be seen when looking at the UV LF derived in our target fields alone.Fig. 3 shows these UV LFs at  = 7 − 8 and  = 8 − 9 as derived from the two WFC3/IR pointings centred at the respective redshifts.To compute this, we use the same selection functions as derived in Section 4.1.1 and compute the effective volume in a given magnitude bin.Given the small area (of 9 arcmin 2 ) centred on UV luminous galaxies, the UV LF in these fields is significantly enhanced in the brightest bins (with    ∼ −22).However, as can be seen, at the fainter end these UV LFs continue to lie significantly above the expected field average values.This is consistent with the hypothesis that these UV luminous galaxies lie in overdensities that are generally enhanced relative to the rest of the Universe.Our results thus indicate that the most UV luminous sources can indeed be used to pinpoint the first overdense regions in the early Universe.

EXPECTATIONS FROM SIMULATIONS
In this section we discuss what simulations predict for the environment of UV luminous galaxies similar to those we have observed.
Specifically, Qin et al. (2021) studied the environment of  = 8 galaxies in the M semi-analytical model (Mutch et al. 2016;Qin et al. 2017) as part of the D simulation program (Poole et al. 2016).Qin et al. (2021) predicted that the most luminous galaxies preferentially lie in overdense regions, in agreement with what we find here from observations.In particular, Qin et al. (2021) used their simulations to compute the number of expected sources within a WFC3/IR field-of-view.This can be directly compared to our observational data.

Number of Neighbouring Galaxies
Figure 4 shows the cumulative number of galaxies expected around the eight most luminous central sources (having  UV < −21.5) in the simulations down to the same magnitude limit as used in our observations.As can be seen, the cumulative number varies between 3 to 13 sources, with a median of 8.5.This is in excellent agreement with our findings around the two sources EGS-zs8-1 and EGS-zs8-2.Both the simulations and the observations thus find a boost in the number counts around luminous sources compared to the field by ∼ 3 − 5×.

Ionized Bubble Sizes
The M simulation further allows us to measure the radii of the ionized bubbles around galaxies.Qin et al. (2021) show that the most luminous sources sit in some of the largest ionized bubbles, at  ∼ 8, both in simulation and in our observations.The left panel shows the cumulative numbers as a function of absolute magnitude.The coloured lines correspond to the actual numbers of galaxies that we have identified in the HST data in different fields (orange for EGS-zs8-1; yellow for EGS-zs8-2).
The grey lines correspond to the numbers from the D simulation (Qin et al. 2021).The right panel shows the total number of sources as a function of the magnitude of the brightest source selected down to our 27.7 magnitude limit using the same colours as for the cumulative histograms.The blue line corresponds to the expected number based on the UV LF and our selection function (same as Fig 2 ), with the shaded blue region showing the 16th to 84th percentile.The horizontal black dashed line shows the average number of sources in the GOODS fields within the same area.Both the simulations and the observations thus find a boost of ∼ 3 − 5× in the number of galaxies around the most luminous sources.
resulting in a higher total transmission of the Ly emission line (see also Wyithe & Loeb 2005;Geil et al. 2017;Hutter et al. 2017).
The most luminous simulated central sources lie in ionized regions with measured radii of around 8-9 comoving Mpc, corresponding to ∼ 1 physical Mpc at these redshifts (see Fig. 5).We can thus expect similar ionized region sizes around our observed galaxies.Using simple estimates, this is indeed what we infer as discussed below.
To convincingly conclude that all of the associated sources we find for our sample lie within the respective ionised bubbles would require both spectroscopic confirmation of the association and, ideally, the presence of Lyman alpha emission.Nonetheless, using the same approach as in Matthee et al. (2018) we can estimate the sizes of the ionized bubbles that each of our sources individually could carve out of the neutral IGM.Assuming a spherical shape of the ionized regions around the galaxies, the expected radii range from 0.5 pMpc up to 1.1 pMpc.These results are consistent with previous estimates in the literature for possible bubble sizes at these redshifts (see e.g., Castellano et al. 2018;Tilvi et al. 2020;Endsley et al. 2021b).
While Tilvi et al. (2020) have confirmed the physical association of two sources around EGS-zs8-1 through spectroscopy, the new galaxies identified in this paper only have photometric redshifts available at the moment.The respective uncertainties in 3D distances are thus too large to determine the physical associations of these galaxies with the central UV luminous targets.However, our detection of an enhancement of fainter galaxies around all these sources points at a larger scale overdensity in the EGS field (see also Bouwens et al. 2015Bouwens et al. , 2019;;Finkelstein et al. 2021).
Interestingly, the two UV luminous LAEs EGS-zs8-1 and EGS-zs8-2 are only separated by 9.1 pMpc from each other.It is thus possible that these two sources lie in a larger, ionized region that spans from  = 7.5 to  = 7.7.Future observations, e.g., with JWST spectroscopy will be needed to confirm this.

SUMMARY AND CONCLUSIONS
In this paper, we identified faint sources at  > 7 around three confirmed UV luminous Lyman  emitters and another, nearby  phot ∼ 9 target in the CANDELS/EGS field.We presented dedicated HST data and combined this with ancillary HST and Spitzer/IRAC imaging in this field.
We find a significant enhancement of fainter galaxies within a WFC3/IR pointing of 4.5 arcmin 2 around each of these UV luminous, central sources.By comparing the number of detected galaxy candidates with the expected numbers from the UV LFs at these redshifts and with the numbers in the two GOODS fields, we estimate these areas to be enhanced by 3 − 9× compared to the average field.
Our observational findings are in excellent agreement with the predictions form the M simulation (Qin et al. 2021), which also find a boost in fainter neighbouring sources around UV luminous galaxies.In these simulations, the combined ionizing photon output of the ensemble of galaxies around the central sources produce ionized regions with radii of ∼ 1 pMpc, allowing Ly photons to be transmitted even when the surrounding IGM is still highly neutral.
Combined with these simulation results, our observational finding of overdense regions around these luminous  = 7.5 − 8.7 Ly emitters thus strongly suggest that the local galaxy environment plays a driving role in the Ly transmission.Specifically, the field around EGSY-z8p7 at  spec = 8.683 is enhanced by a factor ∼ 8.7×, suggesting that this source might sit in one of the first overdensities and ionized bubbles in the Universe, only ∼ 500 Myr after the Big Bang.
The crucial next step is to obtain spectroscopic redshifts for these overdensities in order to map out their 3D structure and estimate the size of the overall ionized regions.The approved JWST cycle 1 program GO-2279 (Naidu et al. 2021) will achieve exactly this based on NIRCam/grism observations of EGSY-z8p7 and EGS910-3, which is only separated by 3.7 arcmin.These observations will provide a sample of [OIII]5007 emission line selected sources at  = 7 − 9, and will thus significantly improve the overdensity measurement.Other NIRCam/grism and NIRSpec observations over other fields are poised to identify a large number of ionized bubbles in the EoR in the near future.This will further clarify the importance of other possible origins for enhanced Ly transmission from UV luminous galaxies, such as velocity offsets or very hard and strong ionization fields, e.g., from an AGN contribution.
The fact that we find an overdensity around each of the three UV luminous Ly emitting galaxies only based on photometric redshifts strongly suggests that the Ly transmission is intimately connected to the overall galaxy environment that helps to create ionized regions.Turning this around, these UV luminous, early Ly emitters can be used to pinpoint the earliest overdense regions in the Universe, where star-formation and reionization first started.This will become particularly interesting with the availability of larger samples of UV luminous LBGs and LAEs from future wide-area surveys with the Euclid or Roman Space Telescopes and follow-up spectroscopy.

Figure 1 .
Figure1.Position of all selected sources in the CANDLES/EGS field centred around our target galaxies.Sources with redshift 7 <  phot < 8 are shown in the left panel, while galaxies with  phot > 8 are indicated on the right.The fields are shown North up, East left.The light shaded region shows the CANDELS  160 coverage.The darker grey areas show fields where F105W imaging is available and that have been included in our search.The three pointings from our own program were centred on EGS-zs8-1 (orange outline), EGS-zs8-2 (yellow) and EGSY-z8p7 (pink).Also highlighted (in green) is the F105W outline around EGS910-3 presented inBouwens et al. (2016) that we include in our analysis.Black squares correspond to the location of Ly emitters.The colourbars indicate the photometric redshifts of the sources.The sizes of the circles represent their  160 magnitude.Also indicated is the 3D distance of 9 physical Mpc between the two Ly galaxies at  ∼ 7.6 on the left.

Figure 2 .
Figure2.The cumulative number of the sources within an area of 4.5 arcmin 2 at  = 7 − 8 (left) and  > 8 (right).The left panels show the cumulative numbers as a function of absolute magnitude.The blue line corresponds to the expected number based on the UV LF and our selection function, while the coloured lines correspond to the actual numbers of galaxies that we have identified in the HST data in the different fields (orange for EGS-zs8-1; yellow for EGS-zs8-2; pink for EGSY-8p7; green for EGS910-3).The shaded regions correspond to the 16th to 84th percentile uncertainties for a Poisson distribution convolved with cosmic variance (see text for details).The right panels show the total number of sources selected down to our magnitude limits using the same colours as for the cumulative histograms,with the points offset horizontally for clarity.Additionally, the purple histograms show the distribution of high-redshift galaxies selected in 82 independent 4.5 arcmin 2 cells in the GOODS-S and GOODS-N fields, using the same selection criteria as for our HST data in the EGS field (see section 4.1.2).The dashed black line corresponds to the mean in the GOODS fields, which is consistent with the mean expected number from the UV LF (blue).As is evident from these figures, the environment around all four UV luminous sources are overdense relative to the general field.The enhancement ranges from 3× to 9×.

Figure 3 .
Figure 3. UV luminosity functions in the EGS field centred around the four UV-luminous sources (dots) compared to expected average Schechter LFs as derived in Bouwens et al. (2021, solid lines), and double power-law LFs Bowler et al. (2020, dashed lines).The purple curve shows the LFs for the  = 7 − 8 galaxies, while the cyan line corresponds to  = 8 − 9.This analysis shows that these UV luminous sources lie in areas with a general density enhancement of 3 − 9× that extends to fainter galaxies.

Figure 4 .
Figure4.The cumulative number of the sources within an area of 4.5 arcmin 2 at  ∼ 8, both in simulation and in our observations.The left panel shows the cumulative numbers as a function of absolute magnitude.The coloured lines correspond to the actual numbers of galaxies that we have identified in the HST data in different fields (orange for EGS-zs8-1; yellow for EGS-zs8-2).The grey lines correspond to the numbers from the D simulation(Qin et al. 2021).The right panel shows the total number of sources as a function of the magnitude of the brightest source selected down to our 27.7 magnitude limit using the same colours as for the cumulative histograms.The blue line corresponds to the expected number based on the UV LF and our selection function (same as Fig 2), with the shaded blue region showing the 16th to 84th percentile.The horizontal black dashed line shows the average number of sources in the GOODS fields within the same area.Both the simulations and the observations thus find a boost of ∼ 3 − 5× in the number of galaxies around the most luminous sources.

Figure 5 .
Figure 5. Number of neighboring galaxies brighter than H 160 =27.5 mag for EGS-zs8-1, EGS-zs8-2 and EGSY-z8p7 (see the star symbols) as a function of their UV magnitudes.Theoretical expectation using the DRAGONS simulation (see more in e.g., Fig. 2 of Qin et al. 2021) is shown with circles for comparison.The size of the corresponding H bubbles (in units of cMpc; indicated by varying circle sizes) is shown with the uncertainties drawn from 500 mock observations.

Figure A1 .
Figure A1.Multi-wavelength images and SEDs of the UV-luminous central targets: three Lyman  emitters and EGS910-3.For each source, the top panels show, from left to right, HST/ACS  606 ,  814 , HST/WFC3  105 ,  125 ,  160 , and Spitzer/IRAC 3.6 and 4.5 stamps.The lower panels show the SED (purple curve with uncertainties), the photometry (black squares with uncertainties), and the resulting probability distribution of redshift P () (green curve).The pink vertical lines represent the spectroscopic redshift of the Ly emitters, showing that the photometric redshifts are accurate for these sources.The last source, EGS910-3, does not have a spectroscopic redshift measurement yet.Our photometric redshift is in excellent agreement with the original determination of Bouwens et al. (2016).

Figure A2 .
Figure A2.Redshift probability distribution functions P () and stamps of fainter  phot > 8 sources in the pointing around EGSY-z8p7.The P () of the UV luminous central target is also shown (orange line), indicating that the neighbouring sources have a significant probability to be physically associated to the central source.The stamps show the same filters as A1 and are labelled on the top.

Figure A3 .
Figure A3.The same as Figure A2 for the neighbouring sources around EGS910-3 with  phot > 8.

Table 1 .
Table with parameters of all sources with redshift  ∼ 7 − 8 identified in the F105W footprints around the two LAEs.

Table 2 .
Table with parameters of all sources with redshift  ∼ 8 − 9 identified in the F105W footprints around the two  > 8 galaxies.
Separation from the central luminous source in arcmin.* This source lies very close to a clumpy foreground galaxy, which could affect its photometric measurements.