The MUSE Extremely Deep Field: the Cosmic Web in Emission at High Redshift

We report the discovery of diffuse extended Ly-alpha emission from redshift 3.1 to 4.5, tracing cosmic web filaments on scales of 2.5-4 comoving Mpc. These structures have been observed in overdensities of Ly-alpha emitters in the MUSE Extremely Deep Field, a 140 hour deep MUSE observation located in the Hubble Ultra Deep Field. Among the 22 overdense regions identified, 5 are likely to harbor very extended Ly-alpha emission at high significance with an average surface brightness of $\mathrm{5 \times 10^{-20} erg s^{-1} cm^{-2} arcsec^{-2}}$. Remarkably, 70% of the total Ly-alpha luminosity from these filaments comes from beyond the circumgalactic medium of any identified Ly-alpha emitters. Fluorescent Ly-alpha emission powered by the cosmic UV background can only account for less than 34% of this emission at z$\approx$3 and for not more than 10% at higher redshift. We find that the bulk of this diffuse emission can be reproduced by the unresolved Ly-alpha emission of a large population of ultra low luminosity Ly-alpha emitters ($\mathrm{<10^{40} erg s^{-1}}$), provided that the faint end of the Ly-alpha luminosity function is steep ($\alpha \lessapprox -1.8$), it extends down to luminosities lower than $\mathrm{10^{38} - 10^{37} erg s^{-1}}$ and the clustering of these Ly-alpha emitters is significant (filling factor $<1/6$). If these Ly-alpha emitters are powered by star formation, then this implies their luminosity function needs to extend down to star formation rates $\mathrm{<10^{-4} M_\odot yr^{-1}}$. These observations provide the first detection of the cosmic web in Ly-alpha emission in typical filamentary environments and the first observational clue for the existence of a large population of ultra low luminosity Ly-alpha emitters at high redshift.


Introduction
The current paradigm of structure formation predicts that most of the gas in the intergalactic medium (IGM) is organized in a "cosmic web" composed of nonuniform gaseous filaments connecting galaxies on scales of megaparsecs (e.g., White et al. 1987;Bond et al. 1996). These filaments feed the circumgalactic medium (CGM), the gaseous component that is responsible for regulating the gas exchange between galaxies and the surround- ing IGM (e.g., Tumlinson et al. 2017). This complex interplay between the infall of gas from filaments through the CGM and the ejection of matter from the feedback processes is expected to play a key role in the regulation of galaxy growth through cosmic time.
The IGM has mainly been explored over the past decades using absorption line spectroscopy, which provides a powerful way to trace the neutral hydrogen observed in Lyman-alpha (Lyα) absorption against bright background quasars (Gunn & Peterson 1965;Meiksin 2009). However, it has not been possible to obtain a detailed picture of these filaments as information is limited to one dimension along the line-of-sight to the background source. The low sky density of sufficiently bright background sources prevents the study of the cosmic web on scales smaller than a few megaparsecs. Only very recently have sparse two-dimensional constraints on the IGM structure at a transverse spatial sampling of a few megaparsecs started to become available (e.g., Lee et al. 2018).
Imaging the cosmic web in emission would provide the missing three-dimensional information. Filaments are predicted to emit the hydrogen Lyα line by fluorescence induced by the ultraviolet background (UVB) radiation. However, the low intensity of the UVB (Haardt & Madau 2012) translates into an expected surface brightness of a self-shielded filament of approximately 10 −20 erg s −1 cm −2 arcsec −2 at z = 3 (Gould & Weinberg 1996). This challenging level has meant that a direct detection of UVBinduced fluorescent emission from IGM filaments has remained elusive (Gallego et al. 2018).
One solution to overcome this obstacle is to observe selected regions where local ionizing sources, such as bright quasars (i.e., quasi-stellar objects, QSOs) or star forming galaxies, boost the Lyα emission to detectable levels (e.g., Cantalupo et al. 2005). This technique has been successfully used to map the IGM at scales of a few hundred kpc around QSOs (Cantalupo et al. 2014;Martin et al. 2015;Borisova et al. 2016;Fumagalli et al. 2016;Kikuta et al. 2019). To extend the mapping of the IGM, specific fields with multiple QSOs have recently been targeted using the MUSE instrument (Bacon et al. 2010) at ESO's Very Large Telescope (e.g., Arrigoni Battaia et al. 2019;Lusso et al. 2019).
However, the scale currently probed by all these observations is limited to a few hundred kiloparsecs, a scale larger than the CGM of the host galaxy but still too small to probe the filaments at the megaparsec scale relevant to the IGM. A notable exception are the observations of Umehata et al. (2019), who report the detection of a 1.3 Mpc long filament with a mean Lyα surface brightness of 3 × 10 −19 erg s −1 cm −2 arcsec −2 in the SSA22 protocluster at z = 3.1 (Steidel et al. 1998).
While these observations have revealed some filamentary structures of ionized gas in very massive structures, they are biased to specific environments. For example, the SSA22 protocluster with its large overdensity of active galactic nuclei (AGNs), submillimeter galaxies and Lyα blobs (Lehmer et al. 2009;Umehata et al. 2018Umehata et al. , 2019Herenz et al. 2020) is an extreme environment. Furthermore, the environment of a QSO with anisotropic UV radiation and a possible excess of tidal debris from past interactions (Canalizo & Stockton 2001) may not be representative of the generic IGM.
Another approach was used by Daddi et al. (2020), who targeted a massive structure, corresponding to a dark matter halo of ≈4 × 10 13 M , embedded in a giant Lyα nebula at z = 2.9 with Keck's KCWI instrument (Martin et al. 2010). Within 300 kpc, they find three cold gas filaments connected to the central massive galaxy. Such observations give important constraints for models of the formation of galaxies in these massive structures located at the nodes of the cosmic web.
However, such very massive structures are not representative of the filamentary environment that represents 60% of the total gas mass of the Universe and where we expect most of the galaxy formation to occur (e.g., Libeskind et al. 2018;Martizzi et al. 2019). For example, the conclusions reached by Daddi et al. (2020) that gravitational energy is the main physical mechanism responsible for the Lyα emission may not stand in two orders of magnitude less massive environments.
It is therefore highly desirable to obtain 3D information of the IGM in emission in more representative environments. Having multiple detections of the cosmic web structure and its evolution with redshift would also give fundamental constraints for the simulation and models of structure formation.
In recent years we have used a fraction of our MUSE guaranteed observing time to perform deep field observations in the Hubble Ultra-Deep Field (HUDF, Beckwith et al. 2006) at 10 and 30 h depths ). These observations have increased the amount of available spectroscopic information in the HUDF by more than an order of magnitude (Inami et al. 2017) and enabled several breakthroughs in our understanding of the high redshift universe, notably the discovery of ubiquitous extended Lyα emission from the CGMs around individual galaxies at z > 3 (Wisotzki et al. 2016;Leclercq et al. 2017Leclercq et al. , 2020. We also performed stacking experiments to explore the extended Lyα emission around galaxies at larger distances, extending the results for individual halos to a radius of 60 pkpc (physical kpc) or 240 ckpc (comoving kpc) at z = 3 (Wisotzki et al. 2018). Although such a distance is well beyond the predicted virial radii of the host dark matter halos for the individual Lyα emitters of our sample, the stacking process erases all geometrical information and is not adapted to the morphological study of the filamentary cosmic web.
Given the successful experience with these MUSE spectroscopic deep fields, we recently performed a new deep field to push forward in depth. The so-called MUSE Extremely Deep Field (hereafter MXDF) is a single field located within the HUDF area ( Fig. 1). It reaches a maximum depth of 140 h and benefits from improved spatial resolution thanks to the recent coupling of MUSE with the ESO Adaptive Optics Facility (AOF) as well as improved data reduction processes (Sect. 2).
In this paper we present new results regarding the cosmic web in emission at z > 3 based on this new data set and complemented by existing HUDF datacubes published in Bacon et al. (2017). Galaxy formation should preferentially take place in the densest part of the cosmic web filaments. Therefore, by selecting moderately overdense regions of Lyα emitters, we should maximize our chances of detecting the diffuse Lyα emission 1 associated with these filaments.
We have explored the datacube in redshift space to select overdense regions of Lyα emitters (Sect. 3). For each overdense region, we search for and study diffuse Lyα emission (Sect. 4). We then discuss the implication of our discoveries in Sects. 5 and 6, and finally conclude in Sect. 7.

Observations and data reduction
The details of the observations, data reduction and source catalogs will be given in a forthcoming paper (Bacon et al., in prep.). We provide here a short summary of the process.

Observations
The observing campaign started in August 2018 and lasted until January 2019, for a total of six runs performed during the new moon periods. All observations were performed with the dedicated VLT GALACSI/AOF Ground-Layer Adaptive Optics system (Kolb et al. 2016;Madec et al. 2018). A total of 155 h of integration were obtained. After the rejection of bad exposures, the achieved final depth is 140 h. To minimize systematics, the field of view was rotated by a few degrees between each observing block, which consist of 4 × 25 min exposures. Consequently, the final combined field of view is approximately circular (Fig. 1) with a radius of 41 and 31 for respectively 10+ and 100+ h depths. The field center celestial coordinates are 53 • .16467, −27 • .78537 (J2000 FK5).

Data reduction
The data reduction is similar to the procedure developed in the first non-AO observations of the UDF . It is based on the MUSE pipeline (Weilbacher et al. 2020) and includes a number of new developments. A super flatfield is performed for each exposure by combining, without any recentering and derotation, a large number of observations obtained during the same run. This "superflat" is subtracted from each individual reduced datacube, all of which are then combined into the final datacube. Remaining sky subtraction residuals are removed with ZAP (version 2.0, Soto et al. 2016). The datacube propagated variance is rescaled to take the impact of noise covariance due to the interpolation process into account (see Bacon et al. 2017). In the region with more than 100 h depth, the average 5σ surface brightness detection limit at 7000 Å is 1.3 × 10 −19 erg s −1 cm −2 arcsec −2 for an unresolved emission line summed over 3.75 Å (3 spectral pixels) and 1 arcsec 2 (5 × 5 spaxels) and the corresponding 5σ point source limiting flux is 2.3 × 10 −19 erg s −1 cm −2 for the same emission line.
Because of the absence of bright stars in the field, we use the muse-psfrec software (Fusco et al. 2020) to estimate the spatial point-spread function (PSF) from the real-time Adaptive Optics telemetry information recorded by GALACSI. The PSFs of each observation are then combined to produce a measure of the final image quality. The PSF is modeled as a Moffat function (Moffat 1969) whose parameters (FWHM and β) change smoothly with wavelength. Thanks to the AO performance, an excellent image quality (Moffat FWHM) of 0.6 and β = 2.1 at 4700 Å to 0.4 and β = 1.8 at 9300 Å is achieved for this very deep exposure.

Source detection and classification
We perform two types of source detection and extraction: a blind source detection with ORIGIN and source extraction and deblending using HST prior information with ODHIN. The ORIGIN software (Mary et al. 2020) has been developed to automatically detect faint line emitters in MUSE datacubes. It is optimized for the detection of compact sources with faint spatialspectral emission signatures and provides an automated and reliable estimate of the purity (i.e., related to the proportion of false discoveries). The software was run with a purity threshold value of 0.8 and resulted in the detection of 2137 emission lines grouped into 1002 sources.
The ODHIN software (Bacher 2017) uses the higher spatial resolution provided by the HST images to perform deblending of sources in the MUSE datacube. The approach is similar to TDOSE (Schmidt et al. 2019) with the difference that it is nonparametric.
For the inspection we limit the ORIGIN sources to the inner 16+ h region (corresponding to an 80 diameter) to avoid an increase in the false detection rate at the edge of the field when the S/N decreases rapidly. This reduces the number of sources to inspect to 845. Similarly, only 389 HST sources from a total of 1387 within the MXDF field have been selected for inspection, using a S/N continuum cut of 0.8 per spectral pixel.
Evaluation of the redshift solutions provided by the Marz cross-correlation software (Hinton et al. 2016) is performed by three independent experts 2 . After reconciliation of the disagreement between experts, a final catalog of 733 sources with redshift, matching sources and confidence is produced.
95% of the Lyα emitters (LAEs) result from ORIGIN detections. In this case, an optimal extraction is performed on the raw datacube using the ORIGIN correlation pseudo narrowband image as a weighting map. While this will produce a more precise flux estimation than continuum based extraction for the case of extended Lyα emission, it will nevertheless miss part of the Lyα extended halo flux 3 .
Flux and equivalent width measurement of the Lyα line is performed with pyplatefit, a Python enhanced version of the PLATEFIT IDL software developed for the analysis of the SDSS survey (Tremonti et al. 2004;Brinchmann et al. 2004). Among the improvements implemented in pyplatefit that are of interest here, one can mention the asymmetric Gaussian (simple and double) line fit and the bootstrap method of evaluating robust errors.
The Lyα redshift is based on the peak of the Lyα line. We note that, because of the resonant scattering properties of the Lyα photons in the interstellar medium, the Lyα redshift is systematically different from the systemic redshift (e.g., Shapley et al. 2003;McLinden et al. 2011;Rakic et al. 2011;Song et al. 2014). Typical velocity offsets are ≈200 km s −1 for LAEs, with larger values (≈500 km s −1 ) for Lyman-break galaxies (Shibuya et al. 2014;Muzahid et al. 2020). As reported by Shibuya et al. (2014) and Muzahid et al. (2020), there is a strong anti-correlation between the Lyα velocity offset and the Lyα equivalent width. Given the high equivalent widths of our LAE sample (see Fig. 5) one can expect an average velocity offset 200 km s −1 .
For the vast majority of our LAE sample, there are no alternatives to Lyα redshifts: nonresonant emission lines such as CIII] 1907,1909 are outside the spectral window or too faint to derive a systemic redshift from and the continuum is too faint to measure any reliable absorption line or feature. A possibility is to use the empirical correlation found by Verhamme et al. (2018) between the FWHM of the Lyα line and the velocity offset. For double-peaked Lyα lines, the redshifts reported by the double asymmetric fits are measured from the averages of the two peak central wavelengths. As shown by Verhamme et al. (2018), this value is a better approximation of the systemic redshift than the Lyα peak location.
We note that very precise absolute redshifts are not essential for this paper given the width of the wavelength window used in the search (Sect. 3). No attempt is made to further correct the Lyα redshift to the systemic value. What matters, however, is the scatter of the Lyα redshifts around their median value at a given systemic redshift. Using the sample of 55 galaxies of Verhamme et al. (2018), we derive a scatter of ±95 km s −1 (95% probability percentile), which is smaller than the windows that we will use.

Source catalog
We combine the revised UDF-10 (30 h depth, 1 × 1 arcmin 2 ) and MOSAIC (10 h depth, 3 × 3 arcmin 2 ) catalogs (Inami et al. 2017;Bacon et al., in prep.) with the new sources discovered in the MXDF (Fig. 1). Selecting all sources with z > 2.9 gives a total of 1258 LAEs in the 9 arcmin 2 field of view of the HUDF observed with MUSE. We note that a low fraction of sources (55 that is 4%) are not strictly speaking Lyα emitters but are detected from their strong Lyα absorption, often associated with weak Lyα emission. These galaxies are brighter in the continuum than the overall population of Lyα emitters and are more typical of Lyman-break galaxies. While a few Lyα emitters have additional detected UV lines such as CIII] 1907,1909or CIV 1548,1550 , there is no source in the catalog without Lyα detection at z > 2.9.
In the deepest MXDF region (depth >100 h), the LAE density reaches 375 galaxies per arcmin 2 . It is the densest collection of LAEs ever obtained in a single field. To illustrate this point, one can perform a quantitative comparison with the large Lyα emitter narrowband survey performed around the SSA22 field by Yamada et al. (2012). The authors identify 2161 candidate Lyα emitters in the redshift range [3.062−3.125], leading to a LAE average surface density of 0.20 arcmin −2 . Using the same redshift range we obtain, thanks to the high sensitivity to low luminosity LAEs, a LAE surface density of 6.2 and 17.9 arcmin −2 in, respectively, the MOSAIC and 100+ h depth MXDF field of view.
The catalog covers a wide redshift range from z = 2.90 to 6.65 (Fig. 2), with a significant part at high redshift (346 LAEs at z > 4.8). Redshift confidence has been attributed by experts, from one (low confidence) to three (very high confidence). There are 402, 513, and 343 Lyα emitters with confidence one, two and three, respectively. The lower confidence LAEs are generally reliable detections (i.e., they are obtained with ORIGIN set at a high purity level) but with lower S/N Lyα line profiles where the Lyα redshift solution is the most probable. In this paper we use all LAEs irrespective of their redshift confidence.
An important fraction (40%, 504) of LAEs has no entry in the Rafelski et al. (2015) HUDF photometric catalog. For a small part (10%, 50) it has not been possible to select a unique HST counterpart among the few possibilities. In some other cases (6%, 30) an HST counterpart can be seen in some of the HST bands that was missed by the automatic identification performed by Rafelski et al. (2015). But for the vast majority of the 504 sources without a Rafelski et al. (2015) counterpart, namely 84% (423 objects) no detectable HST counterpart is observed. A previous study of 103 similar sources in this field, anterior to MXDF observations, shows that these sources are the high equivalent width tail of the Lyα emitter population (Maseda et al. 2018) and that they have on average a high ionizing photon production efficiency (Maseda et al. 2020).

Redshift overdensities of Lyα emitters
A visual inspection of the datacube shows that LAEs are not distributed uniformly but exhibit strong clustering in redshift space.
Here we focus on the detection and properties of LAE redshift overdensities. The method used is described in the following section.

Choice of coordinate system
Given our small field of view, finding overdensities is nothing more than performing a binning of the redshift distribution followed by a peak detection. There are, however, some subtleties to take into account. One of them is the choice of coordinate system which is strategic given the very wide redshift range (2.9−6.7) probed by our observations. A first possibility selected by Cohen et al. (1999) and Gilli et al. (2003) is to use bins in velocity space (c ln[1 + z]). A fixed bin in rest-frame velocity space has the advantage of sampling the Lyα line, independent of the Hubble flow. The sampling must be large enough given that the Lyα profile is generally broad and shifted in velocity 4 .
An alternative is to use a fixed physical scale which allows us to compare the sizes of physical objects independently of the redshift. In this coordinate system, a sampling of 2 Mpc corresponds to 600 km s −1 at z = 3 and 1400 km s −1 at z = 6. Given the small evolution in redshift of the Lyα line profile (Hayes et al. 2021), such a large spread of the sampling with redshift will be suboptimal for the diffuse emission signal extraction.
The last possibility is to use comoving coordinates. In this coordinate system, the cosmic web filaments keep their structural properties along the expansion of the Universe. The sampling evolution with redshift is also less important than for the physical scale: an 8 cMpc sampling translates into 610 km s −1 at z = 3 and 800 km s −1 at z = 6. We therefore selected this last 5000 6000 7000 8000 9000 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 Z 0 1 2 3 4 5 6 Overdensity Fig. 3. LAE overdensities (δ) found in the HUDF as a function of redshift. The top axis displays the corresponding wavelengths in Å. The noise spectrum is shown in blue on the reverse y-axis. Regions excluded from the overdensity search are displayed in pale blue. The 5800−5966 Å masked region is due to the sodium notch filter used to filter out the light from the bright AO Laser guide star. option, which is best-suited to the object of the study and is not far from optimal for signal detection.

Detection of overdensities
Although the search for diffuse emission will be done in the MXDF deep area, the search for overdensities is performed on the full LAE catalog, namely in the entire MOSAIC field of view. With an area ten times larger than in the deeper region of the MXDF, the 3 × 3 arcmin 2 MOSAIC field of view improves the chance to detect large-scale overdensities. The search is done in two steps: we start by identifying peaks in the redshift distribution using a wide sampling. Group redshifts are then refined with a finer sampling, and their overdensity is estimated by comparing the number of LAEs in the group with the expected mean value.
We computed the histogram of the LAE population in comoving space, imposing a constant step of 8 cMpc over the whole redshift range. Given the expected faintness of the signal, we aggressively masked all redshift bins with sky lines. This removes 46% of the bins, mostly at z > 5 (Fig. 3).
The background value (B) was estimated as the mean value of the number of LAEs in a 8 cMpc bin after 3 sigma clipping. It should be noted that the possible redshift dependance of the background was ignored in this preliminary first step 5 . For each histogram peak, its signal to noise ratio (S/N) was estimated as the peak signal (S ), that is the excess number of LAEs over the background and divided by the noise value S + σ 2 B , where σ B is the standard deviation of the background value. After some tests we selected a S/N cut of 2.5 and a minimum number of LAEs of seven. This resulted in 24 peaks.
Naturally, the number and redshifts of overdensities is a function of the input parameters: namely the S/N cut, the minimum number of members and the window size. We checked the sensitivity of the method by playing with these parameters. Increasing the S/N cut from two to three decreases the number of overdensities from 37 to 11. The number of overdensities is stable with the imposed minimum number of LAEs up to ten, and decreases rapidly from 24 to 15 for larger values of the minimum number of members. The selected redshift window size of 8 cMpc is optimal: The number of detections decreases from 24 to 21 and from 24 to 15 when the window size is changed to 6 and 10 cMpc, respectively.
We did not try to further optimize the detection parameters given that it is not our goal to have an exhaustive list of overdensities, but to find the most overdense regions within a redshift window size suitable for the search for diffuse Lyα emission (i.e., large enough with respect to the Lyα line width but not too large to dilute the signal). We therefore proceed to use the optimal 8 cMpc window and the S/N cut of 2.5, which is a good compromise in the number of overdensities to explore.
In a second step we refined the group measurement for the selected peaks using a histogram with a finer grid of 50 kpc sampling. The 8 cMpc size of the window was kept fixed, but its center was slightly adjusted around the initial peak value to maximize the number of LAE members. Groups overlapping in redshift were merged into a single fixed 8 cMpc size window. This reduced the number of overdensities by two. The final group catalog, given in Table C.1, is composed of 22 overdensities. Groups have on average 17 members with a minimum of ten and a maximum of 26.
To estimate the overdensity, that is the ratio of the LAE number density in groups with respect to the mean of the overall population, we first computed the evolution of the mean LAE density with redshift. We partitioned the 2837 sky free redshift bins into 5 redshift bins with a similar numbers of LAEs and computed their mean LAE densities. The mean density decreases approximately linearly with redshift from 0.0201 cMpc −3 in the z = [2.86−3.45] bin to 0.0063 cMpc −3 in the z = [5.75−6.65] last bin.
For each group we can then estimate its overdensity (δ) by comparing the number of group members (N lae ) to the number of expected LAEs using their mean density (ρ lae ) and the group volume (∆V = 8 cMpc × S ), with S the area of the MOSAIC field 6 : δ = N lae /(ρ lae ∆V). The group overdensities are shown in Fig. 3 and the corresponding spatial locations of LAEs in the groups in Fig. 4. The group numbering follows the redshift. There are 370 LAEs in groups, that is 29% of the total LAE sample. The mean overdensity is 3.2 and the densest group is at z = 4.5 and has δ = 5.0.    Figure 3 shows a clear trend with redshift: 2/3 of the groups are found at z < 3.8, in 1/3 of the free-sky comoving total accessible volume. We examine the correlation of overdensities with the AGN population. The 7 Ms Chandra Deep Field South (CDFS) catalog (Luo et al. 2017) has 65 identified AGN at z > 2.9 in the full area (484 arcmin 2 ). We found ten overdensities with AGN at similar redshift (∆z < 0.01) within the CDFS area. As might be expected, a significant fraction (45%) of overdensities harbor an AGN. However, the UDF area is a tiny fraction (2%) of the total CDFS area and there are only two overdensities with an AGN located within the UDF: group 2 at z = 3.07 and group 6 at z = 3.19.

Properties of overdensities
As shown in the first panel of Fig. 5, LAEs in overdensities are low-luminosity Lyα emitters: log(L Lyα erg s −1 ) = 41.5 ± 0.4. We also display in panel 2 the Lyα rest-frame equivalent width (EW) for the subset of Lyα emitters that are bright enough in their continuum to derive a meaningful equivalent width (i.e., with an EW S /N > 3). We note that for the majority of Lyα emitters (65%), their continuum is too faint, and thus we can only derive lower limits for the equivalent width. As already pointed out, many of these galaxies are so faint in the continuum that they are not even detected in the deepest HST images (Maseda et al. 2018).
An important fraction (69%) of group members have an entry in the HST Rafelski et al. (2015) catalog. We use the exquisite ancillary information provided by the HST photom-etry to derive additional physical parameters for the groups. The stellar mass, SFR and specific SFR (sSFR) are inferred via SED fitting using the high-z extension of the code MAG-PHYS (da Cunha et al. 2008(da Cunha et al. , 2015 and enabling for a minimum stellar mass of 10 6 M (see Sect. 3.2 of Maseda et al. 2017). We use HST photometry from the UVUDF catalog of Rafelski et al. (2015) which comprises WFC3/UVIS F225W, F275W and F336W; ACS/WFC F435W, F606W, F775W, and F850LP and WFC/IR F105W, F125W, F140W and F160W 7 On average, the LAEs in overdensities are low-mass (1.4 × 10 8 M ), young (0.3 Gyr) galaxies with high specific star formation rates (SFR > 0.4 M yr −1 ; sSFR > 10 −8.5 yr −1 ), higher than the typical values at these redshifts (Schreiber et al. 2015;Salmon et al. 2015).
Regarding the remaining population (31%) of LAEs not detected in the HST images, we cannot derive quantities like SFR or stellar mass for those galaxies without robust continuum detections from HST. However, as demonstrated by Maseda et al. (2018Maseda et al. ( , 2020, this selection yields faint (M UV ≈ −15), star-forming (β ≈ −2.5) galaxies on average. While we cannot directly determine the stellar masses from the UV continuum alone, an extrapolation to the Duncan et al. From left to right columns: log Lyα luminosity in erg s −1 , the rest-frame log Lyα equivalent width in Å, the log stellar mass in M , the log star formation rate in M yr −1 and the log age in years. Except for the Lyα luminosity, which used all LAEs, only subsets are used for the Lyα equivalent width (35%) and for Mass, SFR, and Age (69%). See Sect. 3.3 for the subsample definition. Top row: histograms of these properties. Bottom row: corresponding global properties for each group (GID): the blue circle symbols show the group medians and the 25% and 75% percentiles, the orange symbols are the summed property for all galaxies in the group. The median value for each property for the full sample is shown as a blue line in each panel. The GID increases with redshift. stellar masses below 10 7 M at these redshifts. Using this average mass, we estimate the contribution of the HST-undetected LAEs to the total stellar mass of the overdensities to be on average ≈1%, with a maximum of 7% for group 20. Figure 5 displays the main properties derived from the SED fitting, in addition to the Lyα luminosity and equivalent width.
We estimate the individual dark matter halo masses associated with each group member with an HST counterpart using the galaxy stellar masses derived above and the stellar-to-halomass relation (SHMR) derived by Girelli et al. (2020). The overdensity mass is then the sum of the halo masses of each galaxy. We note that the observational data used by Girelli et al. (2020) to constrain their SHMR model only extend out to z ≈ 4 and consist of galaxies generally more massive than our sample (M star > 10 10 M at z = 4). However, Girelli et al. (2020) show that their SHMR at z = 3 is between the estimates of Behroozi et al. (2019) and Moster et al. (2018) who use deeper data. We thus chose the fits from Girelli et al. (2020) as a middle guess, and we note that the errors on the SHMR are probably on the order of a factor of two, which is small with respect to uncertainties on our estimates of the stellar masses. Our results are shown in Fig. 6 where we see an average halo mass ∼10 11.3 M , and almost no halos more massive than ∼10 12 M . This is typical of the LAE population we survey (e.g., Garel et al. 2015a), and our groups are likely progenitors of galaxies like our own (Garel et al. 2015b). We note the exception of group 2, which contains one very massive halo of 10 13.5 M , outside the MXDF region. These massive environments, typical of proto-clusters, are known to host AGNs.
In the next section we use the discovered overdensities to search for extended Lyα emission.

Extended Lyα emission
Numerical simulations predict the Lyα emission to occur along filaments that are 50−100 kpc wide and a few megaparsecs long (see for example 1−10 × 10 −20 erg s −1 cm −2 arcsec −2 , depending on the halo mass (e.g., Gould & Weinberg 1996). Such a low surface brightness will not be detectable in the 10 h deep MOSAIC field, so we restrict the search to the MXDF region.
In each overdense region, the MXDF volume is a cylinder of 2.5 cMpc diameter in the transverse direction and 8 cMpc along the line-of-sight. Our goal is to detect faint Lyα emission in between the galaxies.
To perform this detection we developed a two-step method. In the first step, we start by performing a segmentation of the narrowband signal-to-noise (S/N) image to isolate the regions of extended emission. In the second step we compute the Lyα flux and its error in these segmented areas using the continuumsubtracted narrowband image. Extended Lyα emission is marked as detected if the computed S/N and the area of the extended region are large enough. Here that means the estimated probability that noise can explain such a S/N value by chance is sufficiently low.
We note that the two steps are somewhat independent. Alternative methods with different signal transformation to isolate the extended emission could be used to identify diffuse emission segments. Whatever the method, the flux and error computation are always performed in the original continuum subtracted narrowband image. The details of the methodology is described in the following sections.

Multiscale analysis of narrowband images
Detection and estimation of diffuse emission in the presence of sources that are locally two orders of magnitude brighter is a difficult signal processing problem. The additional complexity is that Lyα emitters are themselves extended and thus cannot be considered as point sources.
The geometry of the problem calls for a multiscale approach. An interesting tool in this respect is the wavelet transform and, given the isotropy of astronomical sources, in particular the Isotropic Undecimated Wavelet Transform (IUWT, Bijaoui et al. 1994;Starck et al. 1998). Compared to classical Gaussian filtering, this wavelet transform allows for a separation of the sources according to its spatial scale. In the classical version of the IUWT filterbank, the original image can be directly resynthesized by the simple sum of the multiscale approximation coefficients. This feature is very interesting for detection or denoising approaches operating scale by scale (Starck et al. 1998). Such wavelet signal decompositions have been successfully used in astronomy for various applications: for example X-ray source detection (Finoguenov et al. 2020), diffuse light study in compact group of galaxies (da Rocha & Oliveira 2005) and the estimation of faint and diffuse radio components (Dabbech et al. 2015;Ammanouil et al. 2019).
For each overdensity, we started by building a narrowband S/N image by summing the continuum subtracted S/N cube over a window centered on the mean group velocity with a width corresponding to 8 cMpc (see Sect. 3). This corresponds to nine and 18 wavelength channels at z = 3 and z = 6, respectively.
As the continuum subtracted datacube we use the PCA subtracted S/N cube provided by ORIGIN. The ORIGIN process (Mary et al. 2020) is efficient for removing the continuum of bright galaxies without leaving many artifacts, which can be problematic for low surface brightness detection.
We then decomposed this narrowband image with the classical perfect reconstruction filterbank of the IUWT (Bijaoui et al. 1994;Starck et al. 1998Starck et al. , 2007. In the considered setting, the IUWT decomposes the image into eight component images 8 , from high to low frequencies. The high frequency noise mostly lives in the first two scales, bright and compact sources are well captured by the next two scales and the diffuse, large scale components of the image mostly live in the four remaining scales. The angular size of the spatial structures captured in each band is determined by the impulse responses of the IUWT analysis filters, whose FWHM are respectively in the range [0.2−0.6] arcsec (high frequency band), [0.6−2.2] arcsec (medium frequency band) and [2.2−37] arcsec (low frequency band). As shown in Fig. 7 for group 2 at z = 3.07, this process filters out the noise from the signal (high freq. panel) and cleanly separates the LAE signal (medium freq. panel) from the largescale diffuse structure (low freq. panel).

Shape and flux of Lyα diffuse emission
We used the IUWT S/N images obtained by the process described above to separate the signal of the extended Lyα emission from the Lyα emitter signal.
We started to identify compact sources by segmenting the medium frequency images using a threshold of 3.5σ (right upper panel of Fig. 7). We used this segmentation map to mask all outliers and compact sources in the original S/N image, replacing the mask values by an average S/N value obtained from a first estimate of the diffuse components. This first crude estimate is the average value of coefficients above a threshold of 1.5σ in the low frequency S/N image. We then again applied the IUWT decomposition to this S/N masked and filled image. This iteration removed most of the low frequency signal due to the compact sources (see lower left and central panels of Fig. 7).
We then performed a segmentation of the resulting low frequency image, using a threshold of 1.5σ (lower central panel of Fig. 7). We discarded all segments that have a size smaller than 200 pixels (i.e., 8 arcsec 2 ). This last step is motivated by the desire to keep only contiguous surfaces that are extended with respect to the PSF size (≈5 × FWHM).
At the edge of the MXDF field, the exposure time drops rapidly from 140 to 10 h. Besides having much lower S/N, this outer zone is also subject to larger systematics since it results from a smaller number of individual exposures. In this region, the probability of false detection is thus much larger. Therefore, we discarded all segments that have a mean depth (averaged over the full segment area) less than 60 h (see the corresponding location in the lower right panel of Fig. 7).
To avoid potential pollution by low-z emission line interlopers, we identified all galaxies in the full source catalog with emission lines other than Lyα having S /N > 5 and inside the group window. Any interloper which falls inside the filaments was masked out for the filament's flux estimation.
The final segmentation (lower right panel of Fig. 7) is composed of a series of extended structures, called "filaments" in the rest of the text. We also identified the area covered by the compact sources detected in the first step within each filament. The remaining area within the filaments is called the diffuse extended emission in contrast to the compact source area. We will use this terminology in the rest of the document, referring to filament, compact and diffuse area.
These segmentations obtained in the wavelet space are then used to compute, in the data space, the total flux of each area. The computation is performed on the original narrowband images, obtained from the ORIGIN PCA continuum-subtracted datacube.

Noise and S/N estimation
Although the datacube noise properties have been carefully validated, including the impact of noise covariance, one cannot rely on formal noise propagation when working at such faint surface brightness especially after continuum subtraction processes, which may modify the noise distribution. Any remaining systematics like low background fluctuations can have a significant effect when summing the signal over a large area.
In order to estimate the standard deviation of our flux measurement, we have performed the following computation: We mask all spaxels contained within the filament segments or within the compact source segments identified in the first step (see upper and lower right panels of Fig. 7). Depending on the group this masks 10 to 30% of the total area. We start by computing the average offset and its standard deviation of the main unmasked area (1 diameter or 120 h depth). This mean offset is subtracted from the flux image. The computed offsets are always small: −0.9 ± 1.2 × 10 −20 erg s −1 cm −2 arcsec −2 on average for all groups. We then perform the following bootstrap experiment: we sum the flux of N spaxels selected by random permutation of the list of unmasked spaxels. N is the number of spaxels of the segmented area. At each iteration, we add a single offset flux value randomly drawn from a Normal distribution with zero mean and standard deviation equal to the previously computed offset error. This allows us to take into account the uncertainty of the subtracted offset value. We repeat this 1000 times and compute the standard deviation of the sum values. This gives the empirical noise estimate for the sum of N spaxels. We perform this experiment for 3 different values of N, corresponding to the number of spaxels of the segmented area covered by the entire segments, the compact sources and the diffuse area, respectively.
The standard deviation derived from the bootstrap experiment is then scaled to take into account the correlation of the noise introduced during the data reduction process (see Sect. 4.6 of Weilbacher et al. 2020). This scale factor, estimated during the data reduction, is slowly evolving with wavelength, with values in the range 1.9−2.1. This results in our empirical standard deviation of the flux values for the filament, compact and extended segmented area.
This empirical noise estimation is performed for each overdensity. We compare the computed values with the ones derived from noise propagation, using the values given in the datacube (already corrected for noise correlation). We note that, even after taking the noise correlation into account, the propagated values underestimate the noise by a factor two (2 ± 0.3). This is likely due to small systematics left by the continuum subtraction. In the rest of the document the S/N is defined as the flux divided by the standard deviation of the noise estimated empirically as described above.

Process validation
While the empirical error computed in the previous section gives an estimate of the expected error for the measured flux within a given diffuse emission segment identified by the algorithm, it says nothing about the probability of being fooled by noise, in other words, the probability of estimating a diffuse emission segment with the same or higher S/N when there is no diffuse emission. For each narrowband, the S/N value of the diffuse emission computed by the estimation algorithm can be used as a test statistic to discriminate between a null hypothesis (there is no diffuse emission segment) and the presence of diffuse emission. Owing to the nature of the data and to the various preprocessing leading to the computed S/N values, the distribution of these test statistics under the null hypothesis is unknown and in particular if this is not a Gaussian as we will see shortly. One must consequently resort to Monte Carlo simulations to evaluate this distribution in order to quantify the significance of our findings.
For that purpose we perform the following experiment. We select all wavelength slices free from sky pollution and outside the 22 groups' wavelength regions. This effectively leaves 2582 wavelengths among the 3721 cube wavelengths. We then perform a random permutation in wavelength of this sample and split it into 258 groups of ten slices. The random permutation breaks the spectral continuity ensuring that no genuine extended structure at other redshifts than that of the overdensities can add coherently in the groups of shuffled slices. In contrast, we expect that the contribution of some diffuse emission in contiguous slices might, even if they are very faint in each slice, combine to something detectable in some of the 22 groups under examination. No spatial permutation is attempted because this would break the structure of numerous emission line sources that are present in all spectral regions of the datacube. These sources are also present in the groups under examination and even if the algorithm attempts to remove them in order to estimate diffuse emission, they do affect the algorithm's results. The estimation algorithm (i.e., wavelet transformation, segmentation and empirical noise estimation) is then applied to each group and the resulting S/N of the diffuse emission segment are saved. We repeat this process 50 times in order to increase the size of the control sample to 12 900 groups. From the resulting statistical distribution of the S/N, one can estimate the P-value for a given overdensity (P k ). The P-value P k , associated with a S/N value S /N k , is the probability of obtaining a S/N value equal or larger to S /N k when there is no diffuse emission. We estimate the P-values by counting the empirical fraction of control groups with S /N > S /N k , where S /N k is the S/N of the overdensity under examination (k = 1, . . . , 22). In practice, to take into account the increase in the wavelength range with redshift (from 9 to 18 slices), we repeat the experiment with a similar sample of 12 900 groups of 20 wavelength slices. The resulting P-values as a function of S/N are given in the left panel of Fig. 8. Further validation of the method using independent data sets and simulations are presented in Sects. 4.6 and 5.2.2, respectively.

Extended and diffuse Lyα emission in overdensities
The detection process described in Sect. 4.2 has been applied to the 22 groups. The resulting images are displayed in Fig. 9. As shown by this figure and the left panel of Fig. 10, extended emission 9 is found in most of the overdensities. If we arbitrarly 9 The extended emission refers to the total flux within the filaments including the compact sources area, whereas the diffuse emission refer to the filament area after exclusion of compact sources. set a lower limit of 300 arcsec 2 (or 1.9 × 10 4 pkpc 2 at z = 3) as the minimum area, we have 14 groups (63%) with "extended" emission. The most extended group is group 5 with an area of 1008 arcsec 2 or 6 × 10 4 pkpc 2 , that is 20% of the total available MXDF area. We note that groups ten and 15 show almost no extended emission. As we will see later in Sect. 5.2.2, this is likely due to the size difference between the MOSAIC and MXDF fields of view.
We now review the diffuse area identified within each filament (central panel of Fig. 10). To evaluate their statistical significance, we translate their S/N to a P-value by using the relation between P-value and S/N derived from the control sample ( Fig. 8 left panel). For each overdensity we linearly interpolate the P-values at the corresponding number of wavelength slice of the group. The computed P-values are given in the right panel of Fig. 8. We split the groups in two subgroups: the high confidence overdensities (confidence 1) with diffuse emission signal detected with a P-value smaller than 0.06, and the rest of low confidence overdensities without statistically significant detection (P-value > 0.06, confidence 0). Among the 14 groups with extended Lyα emission, we have five groups with diffuse emission detected at high confidence: two at z ≈ 3 (groups 2 and 5) and three at higher redshifts (groups 17, 18, 19 at z ≈ 4.5). It is worth mentioning that the diffuse Lyα emission in the high redshift groups is detected around 7000 Å, at the peak of MUSE sensitivity where the noise is minimal (column SB of Table C.1).
It should also be pointed out that group 6, which is globally the brightest in Lyα flux and one of the most extended (left panel of Fig. 10), is classified as low confidence (P = 0.3) for diffuse Lyα emission. The reason is that the flux and area attributed to compact LAEs prevail over its diffuse emission. The S/N for the diffuse area flux (central panel of Fig. 10) is then too low to pass the P-value threshold (P < 0.06). For similar reasons, groups 1, 7 and 20 were ranked as low confidence despite their total flux brightness and area.
The average surface brightness of the diffuse emission in the high confidence groups is 5.1 ± 1.2×10 −20 erg s −1 cm −2 arcsec −2 . A statistical overview of the parameters of the high confidence filaments is given in Table 1 while the details for each overdensity are given in Table C We note that, while the algorithm results in the detection of extended and diffuse emission with a given confidence, it does not tell us if this extended emission is indeed Lyα emission. There are other possible sources of diffuse emission in the datacube due to other emission lines such as MgII (Burchett et al. 2020;Wizotski, in prep.;Leclercq, in prep.), [OIII] (Johnson et al. 2018) or even galactic diffuse emission (e.g., Hα, [SII]). However, the former are systematically associated with bright continuum galaxies that are not present in the identified overdensities and the latter are at known wavelengths and can thus be easily excluded. In addition, the fact that the identified diffuse emission overlaps spatially with the known Lyα emitters found independently during the construction of the catalog (Sect. 2.4), suggest that this diffuse emission originates from Lyα emission.
The boundary between compact and extended sources is based on the considered multiscale analysis. An important question is to what extent the segmentation performed on compact sources is representative of the galaxy itself only, or of the galaxy plus its surrounding CGM. In other words, we want to determine what fraction of the filament flux can be explained by the CGM of the identified galaxies. In a few cases, for example group 5, the question is not relevant given that a high fraction of the filaments have no detected LAEs. But in some cases, such as group 2, there are two bright LAEs in the main filament that might bias  We note that group 10 has no compact source detected within the filament area. Notes. Mean, standard deviation, min, and max values are given for the following three components (c): fil = full filament, comp = compact area within the filament, dif = diffuse area within the filament. the filament flux measurement. In Fig. 11 we show that the area of group 2 covered by compact source segments extends up to 27 pkpc from the galaxy center. At such a distance, a very high fraction of the galaxy CGM is already included (see Fig. 15 of Leclercq et al. 2017) and the average surface brightness of the Lyα halos falls below 10 −20 erg s −1 cm −2 arcsec −2 (see Fig. 2 of Wisotzki et al. 2018). We then expect the CGM of these Lyα emitters to have only a small contribution to the measured Lyα emission flux in the diffuse area.
In addition, as shown in Fig. 11, the compact source segmentation map (red contours) is not restricted to the detected LAEs, but covers all medium frequency peaks identified in the NB image. We also note that the flux of these compact sources is on average three times fainter than the Lyα flux of the detected LAEs (Fig. 13). We conclude that our extended diffuse emission flux is a conservative lower limit on the Lyα emission shining outside the CGM of identified LAEs.
For each group we compute the total Lyα luminosity in the diffuse area and compare it to the flux measured in the compact structure area (right panel of Fig. 10, Table C.2). For most of the groups, the measured Lyα luminosity in the diffuse area is significantly larger than in the compact source area. This is true in particular for the five high confidence groups. As seen above, if we assume that the compact source flux is a good proxy of the total Lyα emission of identified galaxies, including their CGM emission, we conclude that a high fraction of the total Lyα emission measured in the filaments is coming from either undetected faint Lyα emitters and their CGM and/or intrinsic diffuse Lyα emission. For the high confidence groups the average log Lyα luminosity (erg s −1 ) in the diffuse area amounts to 42.6 ± 0.1 and represents between 61% and 79% (average 68%) of the total filament Lyα luminosity.
4.6. The case of the z = 3.07 overdensity Although we did not expect to detect diffuse Lyα emission other than the Lyα halos in the MOSAIC and UDF-10 exposures with 10 and 30 h depth, respectively, we nevertheless check for possible exceptions in the high confidence groups. The estimation algorithm shows no convincing emission in all groups with the notable exception of group 2, which displays a clear signal extending the main MXDF filament to the south. Figure 12 displays the complete structure which now extends to a total length of 4.6 cMpc (1.1 pMpc) with a width of 191 ckpc (47 pkpc).
As shown in the inset of Fig. 12, the filament is detected in each of the three independent exposures. This observational evidence constitutes an additional process validation and reinforces our confidence that the structure is real. The MOSAIC and UDF-10 exposures show a second filament crossing the main one at 45 • .
A bright patch of Lyα emission is seen at the south end of the filament (bottom of Fig. 12). The source of Lyα emission coincides with a luminous, QSO-level type II AGN identified as ID 746 in the Chandra 7 Ms catalog (Luo et al. 2017), hereafter referred to as CID 746. The extended Lyα emission surrounding the AGN was already reported by den Brok et al. (2020) in their study of extended Lyα emission around type II AGN. The authors trace extended Lyα out to 80 pkpc from the AGN (see their Fig. 1). We note that their analysis is derived from the same data set (i.e., the MOSAIC MUSE datacube), but with a different post-processing. Our analysis confirms this extension and adds the finding of fainter diffuse Lyα emission at much larger distance. We consider the possible importance of this AGN in Sect. 5.1.2 below. Figure 12 is also illustrative of the density increase of compact sources seen at the MXDF depth with respect to the MOSAIC field. Of course we are not able to prove that all these compact sources are indeed faint Lyα emitters, and some of them are possibly noise peaks, but the fact that a low fraction of them were independently identified as Lyα emitters during the construction of the catalog suggests that the filaments are populated by a high number of faint Lyα emitters. This property is not specific to group 2, but can be seen in the other groups as well (Fig. 9).

Analysis and discussion
Part of the Lyα emission that we detect overall undoubtedly comes from galaxies and their CGM. Emission from the CGM may be powered by a variety of processes, probably combined: dissipation of gravitational energy through cooling radiation, fluorescence from the UV background or from ionizing radiation emitted by local sources, scattering of galactic Lyα through neutral gas in the CGM, extended star formation, or undetected satellite galaxies. Regardless the origin of this emission, we have shown that our analysis separates this signal from the diffuse emission (see for example Fig. 11). It is the origin of this diffuse component, which represents the largest fraction (70%) of the flux, that we try to explain in the present section. Possible sources for this emission are: hydrogen gas heated and ionized by the external UV background, undetected faint Lyα emitters, or gravitational compression. We explore these options in the following sections.

Lyα fluorescence by the cosmic UV background
Optically thick clouds illuminated by a diffuse intergalactic UV radiation field will emit Lyα emission by the fluorescent conversion of H-ionizing photons into Lyα (Hogan & Weymann 1987;Gould & Weinberg 1996). A ubiquitous source of such photons is the cosmic UV background (UVB), namely the integrated UV emission from AGN and star-forming galaxies (e.g., Haardt & Madau 1996. Here we follow the recipes from Cantalupo et al. (2005) to estimate the expected Lyα surface brightness due to UVB fluorescence, in a similar way as in Gallego et al. (2018). We adopt the photoionization rate of neutral hydrogen Γ HM H i and its redshift evolution predicted by the synthesis model of Haardt & Madau (2012) as a baseline value. To assess the uncertainties we also consider the more recent model calculations by Faucher-Giguère (2020)    in Lyman limit systems, that is, in optically thick clouds with H i column densities above 10 17.2 cm −2 with a covering fraction of 1. As shown in Table 2, the fluorescent Lyα emission powered by the UVB cannot explain the totality of the observed surface brightness, but its contribution is not negligible at z ≈ 3, with ≈30% of the total surface brightness. At higher z, the contribution falls to ≈10% (or 20% if we use the Becker & Bolton 2013 photoionization rate). It is important to stress that these values are upper limits as they assume a covering fraction of 1 for the Lyman-limit systems.
We note, however, that the photoionization rates given in Table 2 are average values. Within the overdensities, given the higher number of Lyα emitters, one might expect the local ionising radiation field to be enhanced relative to the background value. However, because the mean free path for ionizing photons is much greater than the size of the overdense region, the typical local enhancement of the ionizing flux is expected to be much smaller than the local overdensity of sources of ionizing radiation (e.g., Schaye 2006). The cosmological radiative transfer simulations performed by Rahmati et al. (2013) confirm that local sources only dominate over the UVB on very small scales. Their Fig. 4 (left panel) displays the various contributions to the photoionization rate as a function of distance from the centers of 10 10.5 −10 11 M dark matter halos at z = 3. One can observe that the contribution of local stellar radiation (Γ LSR H i ) is already a factor of five below Γ UVB H i at the virial radius and then drops to negligible values at larger distances. Given that the measured diffuse Lyα emission excludes by construction most of the local source's CGM (Sect. 4.5), one can then safely ignore the contribution of local ionizing radiation to the Lyα fluorescence in the diffuse emission area.

Lyα fluorescence from active galactic nuclei
In the vicinity of a luminous AGN, the intense UV radiation can boost the Lyα emission by large factors (Haiman & Rees 2001;Cantalupo et al. 2005;Kollmeier et al. 2010). Two high-redshift AGN are known within the MOSAIC field of view, both of which are outside the MXDF but match the redshifts of two of our groups. The spectacular case of group 2 at z = 3.07 was already mentioned in Sect. 4.6; the other AGN is associated with group 6 at z = 3.19, immediately adjacent to the MXDF footprint. We now consider the importance of AGN for explaining the observed diffuse Lyα emission in the MXDF, beginning with group 2.
While an AGN as luminous as CID 746 is intrinsically a copious producer of UV radiation, the object is classified as a highly obscured type 2 AGN. In order to produce any fluorescent emission in the MXDF filament the obscuration must be negligible in the transverse direction toward the MXDF filament, which (in line with den Brok et al. 2020) we assume in the following. We consider two simplified scenarios: If the neutral hydrogen column density is high enough to make the MXDF filament fully self-shielded and if we at the same time assume negligible absorption between the QSO and the filament, its Lyα surface brightness will scale directly with the geometrically diluted intensity of the incident UV radiation. We can then estimate a boost factor, defined by Cantalupo et al. (2005) as the enhancement of UV illumination due to the AGN and quantified in their Eq. (14). The northern extension of the Lyα nebula at 80 pkpc from the AGN has a surface brightness of 1 × 10 −18 erg s −1 cm −2 arcsec −2 , which translates into a boost factor of 50 if we adopt a baseline surface brightness of 2 × 10 −20 erg s −1 cm −2 arcsec −2 for fluorescence at an optically thick H i cloud due to the UVB alone (see Table 2 in Sect. 5.1.1).
This boost factor decreases with the square of the distance to the AGN, and at the MXDF location in 700 pkpc its value becomes 1+50×(80/700) 2 ≈ 1.6. We note that this is an upper limit given that the 3D distance is likely to be larger than the projected separation. In this scenario, the extra diffuse Lyα emission within the MXDF region caused by the AGN would fall short of reproducing the measured value by a factor 4 (see Table C.2).
If on the other hand the hydrogen within the megaparsecscale surroundings of the AGN including the MXDF region is optically thin to ionizing radiation, it is less straightforward to estimate the emergent Lyα surface brightness. Judging by the substantial X-ray luminosity of the AGN, the near-zone around the AGN should be much larger than the ∼1 pMpc distance relevant here (Bolton & Haehnelt 2007;Eilers et al. 2017), implying that the Lyα surface brightness enhancement is dominated by radiative recombinations and scales with the square of the local density. In this case it would indeed be possible that the AGN provides most or all of the additional photons needed to match the observed Lyα emission. Since we have no way to distinguish between these scenarios, we must leave the final judgement open for this particular group.
In case of group 6, the AGN is located so close (≈5 arcsec) to the region of diffuse emission that, although the AGN is much fainter, the emission can be explained with both scenarios. We note, however, that this group was ranked as low confidence for diffuse emission (see Sect. 4.5). In addition, one can observe in Fig. 9 that the diffuse emission is not specifically strong at the immediate vicinity of the AGN.
In the other overdensities with extended emission we did not find any evidence for any nearby AGN, neither based on characteristic features in the MUSE spectra nor from the Chandra catalog (Luo et al. 2017). Given the extraordinary depth of this catalog, this implies that we can confidently exclude any further strong AGN in this region. Low-luminosity AGN below the Chandra flux limit or very highly obscured objects -possibly even Compton-thick ones -are still possible, but these would not be significant contributors of escaping UV photons capable of powering the observed diffuse Lyα emission. Overall we conclude that AGN could boost extended Lyα emission in two of our overdensities, but that they are unlikely to be important in the majority of cases.
Finally, we note that AGN are expected to flicker on timescales of ∼10 5 yr (e.g., Schawinski et al. 2015), in which case the presence or absence of an AGN may not be correlated with AGN-induced Lyα emission at distance >100 pkpc.

Contribution of undetected Lyα emitters to the diffuse Lyα emission
In spite of the depth of the MXDF observations, we expect a significant fraction of LAEs to fall beyond our detection limit. It is then indisputable that at least a fraction of the observed diffuse Lyα flux is coming from these undetected LAEs. We now investigate if these galaxies can be responsible for the totality of the measured flux. In Sect. 5.2.1, we first address this question with simple modeling based on the luminosity function of LAEs. In Sect. 5.2.2, we then use a more sophisticated approach based on the semi-analytic model GALICS (Garel et al. 2012(Garel et al. , 2015a.

Luminosity function toy model
The luminosity function (LF) of LAEs is not observed to evolve strongly between z = 3 and 6 (e.g., Ouchi et al. 2008;Cassata et al. 2011;Herenz et al. 2019), and it is well described by a Schechter (1976)  6 as our fiducial model, but we will also explore later different values of α. Assuming to first order that the amplitude of the LF changes with environment but not its shape, we may write the mean surface brightness (SB) contributed by undetected LAEs as: where φ is the field LF, δ is the over-density measured in each group (Table C.1), and the integral runs over luminosities below a luminosity L det corresponding to a flux limit of 10 −18.5 erg s −1 cm −2 at the redshift of each group. This flux limit is derived as the peak of the Lyα flux distribution (Fig. 13) of compact sources identified by the detection algorithm (Sect. 4.2). ∆l = 8 cMpc is the depth of the narrowband slice, and ζ ≥ 1 is a factor that accounts for angular clustering. We define ζ as the inverse of the filling factor of LAEs in a field: ζ = S MXDF /S LAE where S MXDF 3632 arcsec 2 is the projected area of the field and S LAE is the area occupied by diffuse emission from unresolved LAEs. A value ζ = 1 corresponds to sources uniformly distributed, and a higher value means a stronger angular clustering of sources. At the scales involved in the present study (<1 arcmin), the angular clustering of LAEs is not constrained. Moreover, clustering constraints are statistical by nature and would not allow us to predict a value of ζ group by group. We thus estimate a value of ζ for each group as ζ = S MXDF /S Fil , where S Fil is taken from Table 1. This assumes that all undetected galaxies are distributed within the observed diffuse emission mask. This is probably an upper estimate of ζ as one expects fainter sources to cluster less strongly (e.g., Mo et al. 1998) and hence to occupy a larger area. The values of ζ we obtain in this way range from 4 to 8 depending on the group, with a mean of 6.4. The results of this simple model, with the parameters set to the values discussed above, are shown in Fig. 14 with the blue plusses. This figure shows the ratio of the expected Lyα surface brightness due to undetected galaxies to the one measured in the diffuse component (SB dif from Table C.2) for each high confidence group. We find that undetected galaxies account for about 40% of the signal in groups 2 and 5, about 55% for group 17, and 80−90% for groups 18 and 19. Thus, for most groups studied here, the contribution of undetected LAEs is likely dominant. If we add the contribution from fluorescence of the UVB (Table 2), we can explain about 70% of the signal for groups 2, 5 and 17 and 90−100% for groups 18, 19.
Of course these estimates are crude at best and very uncertain due to many poorly constrained parameters. The steepness of the faint end of the Lyα LF, in particular, is the parameter that has the largest impact 10 . Assuming a steep, yet reasonable, slope α = −1.94, we find that the expected SB due to undetected LAEs actually overshoots our observations by factors ≈2−3 (top blue symbols in Fig. 14). From Eq. (1), it is clear that this tension may easily be resolved by reducing the product ζ × δ by a factor of three. As mentioned above, our fiducial value of ζ is an upper limit and the true value may be anywhere between 1 and our estimate, either due to small-scale clustering variance or to projection effects. The amplitude of the multiplicative ζ correction is shown as an arrow in Fig. 14 for each group. The value of δ, in turn, is evaluated on the full extent of the MOSAIC field, and it is not obvious that the same over-density would be measured within the smaller MXDF area. It is interesting to note that a shallower slope of the LF, say α = −1.74, produces at best 30% of the observed diffuse emission (lower blue symbols on Fig. 14). This would imply that the diffuse emission we observe is for the most part not due to stellar irradiation. The precise value of the faint-end slope of the Lyα LF is thus crucial to understand our observations. Its uncertainties are still large, although a steep slope appears more likely (Herenz et al. 2019). Given the potential role of faint LAEs in producing the diffuse emission we detect, it is interesting to understand down to which luminosities the contribution of these LAEs matters. This depends on the slope of the LF: the steeper the LF the more important the contribution of very faint objects. As an example, for the slopes α = −1.74, −1.84, and −1.94, 50% of the luminosity below L det is contributed by sources fainter than 0.07 × L det , 0.01 × L det , and 10 −5 × L det , respectively. At the redshifts of our groups, L det is in the range ≈2−7×10 40 erg s −1 , and so half of the light from undetected LAEs is contributed by sources brighter than ≈1−5 × 10 39 , 2−7 × 10 38 , and 2−7 × 10 35 erg s −1 for the three slopes, respectively. There are no observational constraints on the Lyα LF at such faint luminosities.
The deepest UV LFs at z ≈ 3−6 are consistent with a powerlaw extending to magnitudes as faint as M AB ≈ −13 (Alavi et al. 2016;Bouwens et al. 2017;Livermore et al. 2017;Atek et al. 2018). Making the crude approximation that Lyα and UV emission are linearly related to the star formation rate and neglecting the effect of dust 11 , this value corresponds to a Lyα luminosity slightly fainter than 10 40 erg s −1 (e.g., Garel et al. 2015a). With the same assumptions, a Lyα luminosity of L Lyα ≈ 10 38 erg s −1 11 Here, we make the reasonable assumption that L Lyα [erg s −1 ] = 10 42 (SFR/M yr −1 ) and that very faint galaxies are likely to be metal-poor, such that dust attenuation is expected to be much less significant than for brighter objects (e.g., Garel et al. 2015a;Maseda et al. 2020).
corresponds to a UV magnitude as faint as M AB ≈ −8, or equivalently to a star formation rate (SFR) of approximately 10 −4 M yr −1 . We note that systems with such a low SFR are observed in the local Universe and that the UV LF at z = 0 keeps rising (at least) down to this level (Bothwell et al. 2011). Extrapolating these local constraints at high redshift is difficult but it is worth pointing out that Weisz et al. (2014) have been able to reconstruct the very faint-end of the UV LF at z = 3−5 down to M AB ≈ −5 using the star formation histories of Local Group dwarf galaxies, showing no evidence for a break of the LF.
Predictions from semi-analytic models typically do not go much fainter than L Lyα ≈ 10 40 erg s −1 or M AB ≈ −13, often because of the mass resolution of the parent N-body simulation (e.g., Garel et al. 2015a;Gurung-López et al. 2020). Alternative models, based on the Extended Press-Schechter formalism, predict that the UV LF at z ≈ 4 keeps rising until UV magnitudes of ≈−8, corresponding to galaxies hosted by halos at the atomic cooling limit (Yung et al. 2019). Beyond the atomic cooling limit, it is unclear whether the processes that regulate galaxy formation are strong enough to break the slope of the LF, especially at redshifts 3−4 when the fully ionized IGM easily resists the gravitational pull of low-mass dark matter halos (e.g., Okamoto et al. 2008).
To understand how our model is affected by the uncertainty on how faint the power law behavior of the Lyα LF extends, we show in Fig. 14 the results of our model when we ignore the contribution of galaxies fainter than L Lyα = 10 38 erg s −1 (red symbols) and 10 37 erg s −1 (green symbols). With these cuts and a slope α = −1.84, the expected SB due to undetected LAEs is in the range 30−70% of the detected emission and thus remains an important source of luminosity. Assuming a steeper slope (α = −1.94), we see that even when the LF is truncated at L Lyα = 10 38 erg s −1 , the signal from undetected LAEs may fully explain our observations for most groups. Alternatively, we clearly see from Fig. 14 that a slope shallower than α = −1.74 would only reproduce 10 to 30% of the diffuse Lyα flux, regardless of the value of the low luminosity cut. In this case, the contribution of undetected LAEs is unable to account for the diffuse emission, unless a very strong clustering is assumed (i.e., a very high ζ). Our model with a sharp cut of the LF below a given luminosity is of course unrealistic as one can expect a smooth transition between the power law behavior and the LF decline. Unfortunately we have no observational clue for the Lyα LF shape at such low luminosity, but one can point out that using a more realistic LF shape will simply increase the required steepness of the LF.

GALICS semi-analytical model
The arguments developed in the previous section demonstrate that the faint, undetected LAEs can be sufficiently numerous to contribute most or even all of the observed diffuse Lyα emission. We now use GALICS, a semi-analytic model of galaxy formation coupled to numerical Lyα radiation transfer models (Hatton et al. 2003;Garel et al. 2012Garel et al. , 2015a, to test this conclusion with a more elaborated model. In the following we give a brief description of the method used to generate mock Lyα narrowband images. A full description of the model is given in Appendix A. GALICS relies on a cosmological N-body simulation to follow the hierarchical growth of dark matter structures and on semi-analytic prescriptions to describe the physics of the baryonic component. We generate 100 mock lightcones that mimic the geometry and redshift range of the MUSE HUDF survey. ID 12195 δ = 6.5, z = 4.96. First column: distribution of simulated galaxies within the MOSAIC and MXDF field of view. Bright Lyα emitters with F Lyα > 10 −18 are marked with red symbols and those with 10 −19 < F Lyα < 10 −18 with orange symbols. In the MOSAIC, the limit for bright Lyα emitters is set to 10 −17.5 . Flux units are erg s −1 cm −2 . Low-luminosity sources extrapolated from the LF (see text) are shown in blue. Second column: corresponding noiseless Lyα narrowband image. Third column: simulated MXDF narrowband image with noise. Fourth column: composite of low and mid frequencies IUWT wavelets S/N components. The contours of the identified filament and compact source area are shown respectively in white and red. The MXDF field of view has an 82 arcsec diameter or 2.6 cMpc at z = 3.
We then reproduce the detection method described in Sect. 3.2 to identify LAE overdensities in the mock fields. In total, this yields 13253 mock groups, of which 2475 have δ ≥ 2 and N LAE ≥ 7.
In the previous section we have shown that a steep faint end slope (α −1.84) of the Lyα LF is required to explain the observed diffuse Lyα emission. The model requires integrating the LF down to very low halo luminosity: 10 37 −10 38 erg s −1 . However, the current dark matter mass resolution of the cosmological simulation used in GALICS is 2 × 10 9 M . This mass cut corresponds approximately to a Lyα luminosity of 10 40 −10 41 erg s −1 , depending on the redshift. To overcome this limitation, we produce a large number of low-luminosity ad hoc LAEs by extrapolating each group LF down to 10 37 erg s −1 . The Schechter LF model already presented in Sect. 5.2.1 with α = −1.84 is used. We note that from the recipes of star formation in GALICS presented in Appendix A.3, we predict that the faint-end slope of the Lyα LF should be about −1.8. These galaxies are then spatially placed to produce a given level of angular clustering (i.e., the factor ζ in Eq. (1)). A Gaussian kernel with a width of 0.1 arcmin is used to randomly place the ad hoc sources around GALICS LAE in the mock fields (see Appendix A for a detailed description).
By construction, the spatial extent of Lyα sources is not modeled in GALICS. In order to build a simulated Lyα narrowband image we proceed as follows: An empty image with the MXDF field of view is created. For each source in the selected GALICS group catalog, we derive a spatial profile using the galaxy-halo decomposition performed by Leclercq et al. (2017). Surface brightness is modeled as a sum of two circular, tow-dimensional exponential profiles. Statistical information and/or correlation with some measured properties is also available from the GALICS output, and are used to constrain the four model parameters. Details of the model can be found in Appendix B.
The simulated Lyα image is then convolved with the MXDF Moffat PSF model and Gaussian noise is added using the MXDF datacube variance values summed over the corresponding spectral window. To take into account the increased noise with respect to the propagated values, as measured in Sect. 4.3, a factor of two is applied to the noise standard deviation. In addition, a Gaussian spatial filter with a FWHM of one spaxel is performed on the noisy image to simulate the noise correlation present in the datacube. This image is finally processed using the same detection parameters as the one used in Sect. 4.1.
We present two examples of simulated groups in Fig. 15. These examples have been selected to be roughly representative of our observations, with overdensity values of 3 and 6, redshifts of 3 and 5 and filament Lyα extended flux of ≈2 × 10 −17 erg s −1 cm −2 . Figure 15 shows that the overall distribution of faint sources is well captured by the detection algorithm. One can see the power of the method by comparing the noisy Lyα narrowband image in the third column with the wavelet decomposition in the right panel. While it is hard to see more than the bright sources in the narrowband, the extended emission is clearly apparent in the low frequency IUWT images.
We note that the results obtained for these two show-case structures are illustrative of the success but also of the limitation of the detection method. In both cases, the algorithm appears sufficiently powerful to identify a diffuse emission structure when there is indeed one, even if it is hardly or not at all visible byeye in the data. However, the second emission is so faint that the estimated S/N (see Table 3) may only be marginally higher than the S/N that the algorithm would attribute to structures identified by mistake in the noise. In effect, if similar structures and A107, page 17 of 26 A&A 647, A107 (2021)  Notes. ID: GALICS group ID. z: redshift. δ: overdensity factor. S/N were obtained for some groups of the MXDF datacube, their associated P-values would be 0.3% and 40% respectively (see Fig. 8). Hence, in a noise-only situation, the S/N would be found to be larger than the S/N of the second show-case 40% of the time, showing that such a source is close to the detection limit. We note, however, that this reasoning assumes that the relation between P-values and S/N shown in Fig. 8 is also valid for the GALICS simulated narrowbands, which is probably pessimistic given the difference in noise properties between the real and the simulated data 12 . The existence of such a limit is indeed inherent to any detection method. These remarks help to interpret the results of Fig. 9, where many groups show diffuse emission that (if real) resembles the examples of the two show-cases of Fig. 15.
In Fig. 16 we show the derived filament parameters obtained when running the detection process on the 2475 GALICS groups with δ ≥ 2 and N LAE ≥ 7. The simulations present a large scatter in the filaments properties. A large part of it is due to fieldof-view effects. For statistical reasons we have used the large MOSAIC field to search for overdensities, but depending on the relative location of the filaments with respect to the smaller MXDF field, one might miss part of the structure. This is obvious in Fig. 15 where one can see that moving the MXDF location will eventually lead to no filament detection.
Compared to the GALICS simulation, the observations (black labels in Fig. 16) are located in the upper tail of the distribution of the simulated filament properties. This is particularly true for the diffuse Lyα flux and surface brightness (second and right panels in Fig. 16). As we can see from the figure, in most cases, the simulation is able to reproduce the observed Lyα surface brightness in diffuse areas (SB dif ) only in very overdense regions (δ > 10). Figure 15 indicates that the Lyα halos surrounding the numerous faint LAEs in each overdensity should overlap significantly, leading to a projected covering factor of unity over the entire area. We recall that in our previous study of the incidence rates of Lyα emission (Wisotzki et al. 2018) we found that without spatial clustering, a surface brightness level of 5 × 10 −20 cgs corresponds to an incidence rate on the order of 0.5 per unit redshift, comparable to that of high column density H i absorbers. In the overdense regions considered here this incidence rate presumably increases further. It is therefore plausible (but unfortunately untestable without bright background sources) that each line-of-sight through these overdensities would also reveal strong H i absorption. 12 This difference between the simulation and our observations could be explained by the idealized nature of the noise properties in the simulation. Although care was taken to properly scale the noise to reach the variance level observed in the data (see Sect. 4.3), the Gaussian noise injected in these simulations is different from the actual distribution of the noise. In particular, the imperfect continuum subtraction may leave behind observed narrowband Lyα image systematics, which are not present in the simulated narrowbands.
The median surface brightness estimated in the GALICS simulations is 1.7 × 10 −20 erg s −1 cm −2 arcsec −2 at δ < 5, a value significantly smaller than the minimum surface brightness in the high confidence groups (3.7 × 10 −20 erg s −1 cm −2 arcsec −2 ). Only 8% of the simulated groups have a surface brightness brighter than this value. This is a factor of three below the 23% (5/22) success rate of the observations which implies that the GAL-ICS model is producing a lower average Lyα surface brightness. The ad hoc Gaussian model we use to extrapolate GAL-ICS results and to distribute faint sources is likely responsible for part of this disagreement. Indeed, (brighter) galaxies are typically observed to have a power-law correlation function, steepening at small scale due to the one-halo term (e.g., Hildebrandt et al. 2009;Harikane et al. 2017). The Gaussian kernel we use smooths out the one-halo term at scales below ≈0.1 arcmin which results in spreading the luminosity of faint galaxies too widely.
We note also that our GALICS mock catalogs are based on many other approximations, including an extrapolation of the Lyα LF by three orders of magnitude and an empirical model of galaxy and halo decomposition. It is beyond the scope of the present paper to present a self-consistent physical model compatible with our observations. Nevertheless, despite these approximations, the analysis of the mock data confirms that faint LAEs cannot be ignored in the photon budget and that they can produce a significant fraction or even most of the observed diffuse emission.

Gravitational heating
Gravitational compression of gas during structure formation is a net heating term that may be dissipated through the emission of Lyα radiation. In the low-density IGM, the thermal state of the gas is determined by a competition between cooling and heating both from gravitational compression and from the UVB (e.g., Hui & Gnedin 1997). At values of the density contrast 13 typical of large-scale filaments (≈10), the temperature of the gas is mostly the result of the competition between photo-heating from the UVB and radiative cooling (through Lyα emission). From Rosdahl & Blaizot (2012, their Fig. 9), we see that the Lyα emissivity of gas in this regime (i.e., at densities <0.01 cm −3 ) is well described by a steep powerlaw ∝n 2.5 H . That the slope is steeper than n 2 H is due to the fact that the ionized fraction rises toward low densities. Thus, at low densities, the joint contribution of the UVB and gravitational compression to Lyα emission drops rapidly and is not likely to contribute significantly to the emission we observe.
Within dark matter halos, or in their immediate vicinity, part of the IGM has condensed into denser cold accretion streams (e.g., Kereš et al. 2005). These streams can produce Lyα  Fig. 15 and Table 3. emission by dissipating their gravitational energy (Fardal et al. 2001;Dijkstra & Loeb 2009;Faucher-Giguère et al. 2010;Rosdahl & Blaizot 2012). In this regime, gravitational heating largely dominates over fluorescence from the UVB (Rosdahl & Blaizot 2012, their Fig. 6). This mechanism has been suggested as a possible source of the bright Lyα extended emission observed in Lyα blobs (e.g., Haiman et al. 2000;Fardal et al. 2001;Nilsson et al. 2006;Rosdahl & Blaizot 2012). For example, in their study of the R0-1001 Lyα nebulae, Daddi et al. (2020) conclude that the gravitational energy associated with gas infall is the most likely source of power for the observed extended Lyα emission (1.3 × 10 44 erg s −1 ). The same mechanism should also occur in less massive galaxies when cold gas falls into the potential wells of their dark matter halos. This may explain part of the extended Lyα emission observed around most LAEs with MUSE (Wisotzki et al. 2016;Leclercq et al. 2017). Again, by construction, the diffuse Lyα emission we measure excludes the CGM of all LAEs detected in the field, and we can thus rule out that cooling radiation is responsible for the diffuse emission we report.

Source of diffuse Lyα emission
In the previous section, we discussed three possible sources of energy that could power the diffuse Lyα radiation we observe: (1) fluorescence from the UVB (2) Lyα emission from an abundant population of undetected small Lyα emitters, and (3) dissipation of gravitational heating. We find that Lyα fluorescence powered by the cosmic UV background can explain at most 28−34% of the observed signal at z ≈ 3, and less than 10% at z ≈ 4.5 (Sect. 5.1.1). Even if our estimate is relatively uncertain (for example we use the maximum value of one for the LLS covering fraction as in Cantalupo et al. 2005 and we neglect pumping from other lines that may enhance the signal by ≈20%, Furlanetto et al. 2005), these results suggest that this process is not responsible for the bulk of the emission we observe. Perhaps more importantly, we note that Lyα fluorescence from the UVB is mostly produced by relatively dense gas, with column densities typical of Lyman Limit Systems (LLS) or larger. This gas is mostly located in the CGM of galaxies (or within their dark matter halos, e.g., van der Voort et al. 2012) and its Lyα emission may thus be associated with galaxies in Lyα surveys, in the form of Lyα halos. Moreover, while increasing the intensity of the local UVB will increase the overall emissivity of this dense gas, it will also decrease the emissivity of lower-density gas (at n H < 0.05 cm −1 , see Rosdahl & Blaizot 2012, their Fig. 16), thus pushing the emission even closer to the galaxies. It is thus not clear whether fluorescence from the UVB produces a signal that is distinguishable from that produced by galaxies and their CGM.
Similarly, we have argued that gravitational heating will lead to dissipation through Lyα emission only in the densest parts of the IGM, in the CGM. This emission may contribute to the extended Lyα halos around LAEs, in addition to fluorescence from the UVB or local star formation, to scattered Lyα photons from galaxies, or to extended star formation (e.g., Mas-Ribas et al. 2017). For more diffuse gas, the heating source is a combination of UVB and gravitational heating, which cannot be disentangled, but which has been measured in simulations to produce extremely faint emission, decreasing rapidly with decreasing density (Rosdahl & Blaizot 2012). Thus, it appears that fluorescence or cooling radiation, while they may play a significant role in lighting up the CGM of galaxies, are not able to explain the levels of extended emission that we detect beyond this CGM.
We therefore find that the most likely explanation for the diffuse emission we observe is an abundant population of ultrafaint undetected LAEs (Sect. 5.2). This result is in line with the recent findings of Mitchell et al. (2021) who use highresolution radiation-hydrodynamical simulations to show that the very extended part of Lyα halos around galaxies is mostly due to undetected neighboring galaxies. Our conclusion relies on a number of assumptions: the faint-end slope of the Lyα LF must be steep enough (α −1.84), this power-law behavior must extend down to at least ∼10 38 −10 37 erg s −1 before turning over and the clustering of faint Lyα emitters must be favorable (filling factor ∼1/6). While these assumptions seem reasonable, there are today no observational constraints on the Lyα LF shape below 10 40.5 erg s −1 or on the angular clustering of extremely faint Lyα emitters. Our observations may provide a new original and indirect constraint on this population of extremely faint LAEs.
To appreciate the nature of the extremely faint objects that may contribute to the observed signal, we remind the reader that a single massive star of 40 M will produce of order 10 49 ionizing photons per second for ≈6 Myr (e.g., Geen et al. 2018). Assuming these photons are all processed to yield Lyα through case B recombination, this single star would produce ∼10 38 erg s −1 in the Lyα line. The detailed simulations of molecular clouds by Kimm et al. (2019) show a more complete picture. These authors predict that a cloud of total mass 10 5 M with 10 4 M in stars has a Lyα luminosity that evolves from 3 × 10 39 erg s −1 at the onset of star formation to 5 × 10 36 erg s −1 when the cloud is disrupted 6 Myr later. A cloud of the same mass, but which forms ten times less mass in stars (i.e., with stellar mass 10 3 M ) is found to have a Lyα luminosity evolving from 6 × 10 38 to 3 × 10 35 erg s −1 over 20 Myr. In both cases, the Lyα luminosity of the cloud drops faster than the production of ionizing photons by the stellar population because the disruption of the cloud allows ionizing photons to escape without being reprocessed into Lyα. Systems with Lyα luminosities as faint as 10 37 −10 38 erg s −1 thus require very few massive stars to produce such luminosities. While some of these objects may be explained by stronger star formation events seen at later stages (when the ionizing radiation has dropped and neutral gas has been blown away), or by very low Lyα escape fractions, it is likely that sources with L Lyα ∼ 10 37−38 erg s −1 are indeed extremely small systems, with stellar masses as small as a few thousand solar masses, which are seen at the very first moments of their formation. These could be nascent galaxies or even (compact) star clusters, which have been observed at similar implied continuum magnitudes at higher redshifts (e.g., Boylan-Kolchin 2017; Vanzella et al. 2018Vanzella et al. , 2021, although currently we cannot differentiate between the two possibilities. While assessing the baryonic content of these ultra-faint LAEs is highly uncertain, we do not expect that these objects make a significant contribution to the global stellar mass budget. By extrapolating the SFR-M relation at z ≈ 3−4 (e.g., Salmon et al. 2015) and assuming our fiducial LF slope α = −1.84, we estimate that LAEs with 10 38 < L Lyα < 10 42 erg s −1 should only contribute 10−20% to the cosmic stellar mass density. The existence of a large number of faint galaxies also has some implications for synthesis models of the UVB. One of the key ingredients of these models is the total emissivity of galaxies, which scales with the integral of the UV LF down to faint luminosities: 0.01 L (Haardt & Madau 2012) or M AB = −13 (Faucher-Giguère 2020). Extending these integrals down to M AB ∼ −8 as suggested above would increase the global UV emissivities of these models by ∼50% (Haardt & Madau 2012) and ∼20% (Faucher-Giguère 2020). While these models can probably absorb this difference by readjusting their free parameters to reproduce the same observed values of Γ HI , it is interesting to note that FG20's model requires values of the escape fractions of ionizing radiation as low as 1% (at z = 3), which is already in tension with observational results (Steidel et al. 2018) and numerical works (e.g., Rosdahl et al. 2018). Increasing the total emissivity of galaxies by extending the LF to the faint-end makes this situation yet more uncomfortable. If confirmed, the very faint population of LAEs that we discussed above may thus require more important adaptations of these models.
We note that our analysis assumes that the Lyα emission from the CGM is counted as part of each galaxy's luminosity. Indeed, while previous measurements of the Lyα LF were restricted to the Lyα flux from galaxies, and were thus missing a large part of the Lyα flux (70% on average, Leclercq et al. 2017), the latest measurements with MUSE carefully include the halo flux (e.g., Drake et al. 2017;Herenz et al. 2019). Given the difficulty of measuring the extended Lyα flux at large galactocentric distance, the Lyα flux in the CGM may still be underestimated. This uncertainty affects the requirement stated above on the shape the LF should have to reproduce our observations.
Including the CGM Lyα emission in each galaxy's luminosity makes sense in a scenario where this emission is due to scattered light from the central galaxy. In that case, the total Lyα luminosity better relates to the star formation rate. This is a strong assumption, however, which is still heavily debated in theoretical work (e.g., Mitchell et al. 2021;Byrohl et al. 2020). It is interesting that the other sources of energy (UVB or gravitational compression) are also expected to produce most of their Lyα emission in the CGM. In that respect, these different processes are not strictly additive. Indeed, it is likely that the signal we detect does include a contribution from these terms, and that this contribution is present mostly in the CGM of faint galaxies, which we account for in the LF regardless of its physical origin.
Finally, we note that our results do not confirm the conclusion of Elias et al. (2020) who have predicted that diffuse Lyα emission from cosmic web filaments at z ≈ 3 and restricted to the IGM, should be easily detected by MUSE in a 30 h deep exposure. Our measurements within the MXDF, a 5 times deeper exposure than the one foreseen in their analysis, confirm that their prediction, as already pointed out by the authors, was overly optimistic. Moreover, we show that the majority of the diffuse flux is due to the Lyα emission within the CGM of undetected galaxies and that only a small fraction can be coming from the IGM proper.

Summary and conclusions
We introduced and analyzed observations carried out with MUSE in the MXDF, a single, 140 h deep field located in the HUDF area, complemented by MUSE observations of the entire HUDF area at 10 h depth. This resulted in datacubes of exquisite sensitivity and a catalog of 1258 Lyα emitters covering redshifts 2.9 to 6.7. We analyzed this unique data set to (i) detect and quantify Lyα emitter overdensities and (ii) search and characterize diffuse Lyα emission within each detected overdensity. Our major findings are summarized below: Lyα emitters are strongly clustered in redshift space, with 30% of the 1258 Lyα emitters residing in 22 overdensities spanning redshifts 3.0 to 5.8. The overdensity, measured within volumes of 260 cMpc 3 (z = 3.0) to 415 cMpc 3 (z = 5.8), is on average 3.2 and reaches 5.0 for the densest groups at z = 4.5 and z = 4.7. LAEs in overdensities have a low average Lyα luminosity (10 41.5 erg s −1 ) and their mean number density is 0.055 ± 0.017 cMpc −3 .
SED photometric analysis of the 67% of the LAEs with an HST counterpart shows that the galaxy population in overdensities is mainly composed of low-mass (1.4 × 10 8 M ), young (0.3 Gyr) galaxies with high specific star formation rates (10 −8.5 yr −1 ). The majority of the remaining LAEs without an entry in the Rafelski et al. (2015) photometric catalog display very high Lyα equivalent widths and no visible HST counterpart. On average, these very faint (M UV ∼ −15) star forming galaxies must have stellar mass below 10 7 M . Overdensities are expected to be populated by dark matter halos with masses around 10 11.3 M , with the notable exception of the z = 3.07 group 2 for which we estimate a halo mass of 10 13.5 M .
A search for diffuse Lyα emission within the MXDF area with an original method based on multiscale undecimated isotropic wavelet transforms, resulted in the identification of 14 overdensities with extended Lyα emission. This extended Lyα emission arises from filamentary structures of cMpc size and covers an area of 0.4−1.1 cMpc 2 , corresponding to a significant fraction (10−20%) of the total MXDF area. Group 2 at z = 3.07 is the only group displaying a filament outside the MXDF field, in an extended area covered with MUSE at 10 h depth. This additional detection extends the MXDF filament length to 4.6 cMpc, and reveals a second filament crossing the field.
We use the mid and large spatial scale segmentation images resulting from the wavelet decomposition process to split filaments into areas corresponding to compact sources and diffuse A107, page 20 of 26 emission. We show that the compact source area can account for the total Lyα emission of identified galaxies, including the emission from their CGM. Using Monte Carlo simulations we show that five overdensities among the 14 with extended Lyα emission display an extended diffuse Lyα emission signal with a mean surface brightness of 5.1 ± 1.2 × 10 −20 erg s −1 cm −2 arcsec −2 and a high confidence level (P-value < 0.06). The Lyα luminosity from this diffuse area represents a large fraction of the total filament flux: 68 ± 6%.
We have investigated the potential impact of AGN and concluded that, except for groups 2 and 6 where we cannot rule out a boost in Lyα surface brightness, AGN are unlikely to be a significant contributor to the observed extended Lyα emission in the 14 groups. This is especially true for 4 of the 5 overdensities with high confidence diffuse Lyα emission.
At z ≈ 3, a maximum of 28% and 34% of the observed surface brightness of the diffuse emission can be explained by Lyα fluorescence powered by the cosmic UV background. At higher z, this fraction is reduced to below 10%. The measured diffuse Lyα surface brightness can be reproduced by a population of undetected ultra low-luminosity Lyα emitters, provided that the faint end of the Lyα LF is steep enough (α −1.84), that it extends to luminosities lower than 10 38 −10 37 erg s −1 and that the clustering of faint Lyα emitters is significant (filling factor <1/6).
Narrowband Lyα emitter wide field surveys, followed-up by spectroscopic multi-object observations, have produced a wealth of information on LAE clustering and the large-scale structure of galaxies at high z (e.g., Francis et al. 2004;Matsuda et al. 2005;Zheng et al. 2016;Ouchi et al. 2020). They cover a large volume: for example 200 × 200 × 80 Mpc 3 for the SILVER-RUSH survey (Ouchi et al. 2018;Shibuya et al. 2017), but at the expense of a bright limiting luminosity (typically ∼10 43 erg s −1 ) and sparse sampling (e.g., 179 LAEs for the Harikane et al. 2019 study at z = 5.7 and 6.6, that is 5.6 × 10 −5 Mpc −3 ). With its two orders of magnitude higher average number density of LAEs (e.g., 6.3 × 10 −3 Mpc −3 at z = 6), our study for the first time probes the large-scale structure at high z with a much finer sampling. Thanks to this zoomed view of the cosmic web, we can resolve the filamentary structure in unprecedented detail. This is very complementary to the classical wide field approach.
Using both a simple analytical LF model and the GALICS semi-analytical model, we show that our measurements imply that a large population of ultra low-luminosity Lyα emitters (<10 40 erg s −1 ) powers the diffuse Lyα emission within the filaments. The Lyα luminosity of this ultra faint population corresponds to star formation rates smaller than <10 −4 M yr −1 . An important consequence of these results is that it is unclear whether intergalactic (as opposed to circumgalactic) gas is even observable through its Lyα emission because the denser regions where this emission is most likely to be detected will be crowded with a high surface density of small Lyα emitting galaxies. Even if the study of IGM proper in emission could be problematic, understanding how star formation proceeds in these dwarf galaxies, down to what Lyα luminosity the LF extends, what their clustering properties are and what the relative importance is of the various processes that produce the Lyα emission within the CGM and IGM, will be very valuable to understand galaxy formation. Last -but not least -the population of faint Lyα emitters appears to be a good tracer of the cosmic web.
This first detection of the cosmic web structure in Lyα emission in typical filamentary environments, namely outside massive structures typical of web nodes, is a milestone in the long search for the cosmic web signature at high z. This has been possible because of the unprecedented faint surface brightness of 5 × 10 −20 erg s −1 cm −2 arcsec −2 achieved by 140 h MUSE observations on the VLT. However, because of the limited size of the MXDF, we cannot trace the full length of the filaments, which extend at least to the megaparsec (physical) scale as demonstrated in the case of the z = 3.07 filament (Fig. 11). Repeating this study on a larger field and different environments would be very valuable to increase the detection statistics and to provide better constraints on the filaments' physical parameters. This remains, nevertheless, a costly investment in telescope time. In the longer term, the BlueMUSE project (Richard et al. 2019) will allow us to probe the Lyα redshift range 2−4 with a larger field of view. The lower Lyα redshift of BlueMUSE will be very beneficial for such a study. With a wavelength range mostly free of bright sky lines, a reduced impact of redshift dimming, and access to the peak of the cosmic star formation history, Blue-MUSE should be able to perform similar studies in less telescope time and over larger fields of view.

A.1. Model description
The GALICS model (Hatton et al. 2003) presented in Garel et al. (2015a) is specifically designed to study the formation and evolution of galaxies in the high redshift Universe. It relies on a cosmological N-body simulations to follow the hierarchical growth of dark matter structures and on semi-analytic prescriptions to describe the physics of the baryonic component. The simulation was run in a box of 100 h −1 cMpc on a side and it assumes a standard cosmology that is consistent with the WMAP-5 results. The simulation contains 1024 3 dark matter particles with an individual mass of ≈8.5 × 10 7 M . halos are identified if they contain at least 20 dark matter particles which corresponds to a minimum halo mass of about 10 9 M .
The intrinsic Lyα emission from galaxies is estimated from the production rate of hydrogen-ionizing photons (estimated from the stellar spectral energy distributions) assuming case B recombination. To compute the observed Lyα properties of galaxies, GALICS is combined with the library of radiative transfer simulations of Schaerer et al. (2011) which predicts the escape fraction of Lyα photons through dusty galactic outflows (see Verhamme et al. 2008;Garel et al. 2012). The GALICS Lyα luminosities used in the analysis presented in Sect. 5.2.2 therefore correspond to the dust-attenuated luminosities that emerge from the CGM of galaxies. As shown in Garel et al. (2015a,b), the model was tuned to reproduce various fundamental observational constraints at z ≈ 3−7, including the UV and Lyα luminosity functions, the stellar mass functions and the SFR to stellar mass relation.
For the sake of the present study, we follow the procedure of Garel et al. (2015b) to generate 100 mock lightcones that mimic the geometry and redshift range of the MUSE HUDF survey (that is, a square field of ≈9 arcmin 2 and 3 ≤ z ≤ 6.7). We then reproduce the detection method described in Sect. 3.2 to identify LAE overdensities in the mock fields. In practice, we bin the redshift distribution in slices of 8 cMpc, count the number of detected LAEs using a LAE flux detection limit of 10 −18 erg s −1 cm −2 inside the MXDF and 10 −17.5 elsewhere, and compute the overdensity δ corresponding to each group. In total, this yields 13 253 mock groups, among which 2475 have δ ≥ 2 and N LAE ≥ 7.

A.2. Faint-end extrapolation of GALICS mocks with ad hoc LAEs
Due to the limited mass resolution of the cosmological simulation, dark matter halos less massive than 2 × 10 9 M are not detected in GALICS. As a consequence, the number density of LAEs can be under-predicted at the faint-end. Our LAE sample is estimated to be incomplete below a Lyα flux limit of F lim ≈ 2 × 10 −19 erg s −1 cm −2 (see Garel et al. 2015b).
In an attempt to correct for this effect, we add ad hoc LAEs in our mock fields (for simplicity, the MXDF is assumed to lie exactly at the center of the HUDF field). We do so by extrapolating the Lyα LF between L lim , the Lyα luminosity limit corresponding to F lim at the redshift of the group, and a minimum Lyα luminosity, L min . In practice, the Lyα luminosity limit varies from 1.6 × 10 40 erg s −1 at z = 3 to 1.1 × 10 41 erg s −1 at z = 6.7. The choice of L min is somewhat arbitrary because the Lyα LF is unconstrained in this luminosity range. These considerations are beyond the scope of the present study, therefore we assume a fixed value of L min = 10 37 erg s −1 , which is more than three orders of magnitude smaller than the faintest observed LAEs. Based on the canonical relation between Lyα, star formation rate and UV luminosity (e.g., Garel et al. 2015a), this value would translate into SFR ≈ 10 −5 M yr −1 and M UV ≈ −6.
Using the best-fit parameters for the Lyα LF at 3 < z < 6 of Herenz et al. (2019), logΦ = −2.71, logL = 42.66, α = −1.84, the number density of LAEs is given by the Schechter function, Φ(L), which can be expressed as Φ(L) ≈ Φ L L L α for L L . Then, the extrapolated number density between L min and L lim is given by n ext = L lim L min Φ(L)dL = Φ (α+1)L α+1 (L α+1 lim − L α+1 min ). Then, we compute the total number of ad hoc LAEs per group, N ext , as the product of n ext by the comoving volume of the group, V gp , and the group overdensity, δ. From there, we randomly draw N ext luminosities between L min and L lim according to a powerlaw distribution with a slope α.
As an example, we present the result of this procedure in Fig. A.1 which shows the luminosity distribution of LAEs in the mock GALICS group 12195. This group is located at z = 4.96 and has an overdensity δ = 6.5. The black histogram corresponds to the number of GALICS LAEs while the red curve shows the extrapolated distribution of ad hoc LAEs between L min and L lim . In this case, there are 361 true sources in the initial GALICS catalogue. As a result of the extrapolation between L min and L lim , 328 049 ad hoc sources are added to the catalog. We note that GALICS sources that are fainter than L lim are discarded from the final catalog such that we keep only true LAEs at L > L lim and only ad hoc LAEs at L ≤ L lim .

A.3. Expected faint-end slope index
While GALICS cannot reproduce the full population of very faint LAEs because of the mass resolution limit, galaxies with Lyα luminosities L Lyα = 10 37 −10 39 erg s −1 do exist in the current simulation (see Fig. A.1). These are mostly located in young halos at the mass resolution threshold that recently appeared in the simulation, while a smaller fraction (≈25%) correspond to satellites sitting in more massive halos. From scaling arguments, we expect that the faint-end slope index is on the order −1.8, which is consistent with the value assumed above.
The origin of this relation comes from the Kennicutt law for star formation used in GALICS (see Garel et al. 2012Garel et al. , 2015a which can be expressed as SFR ∝ M 1.4 gas /r 0.8 gal where M gas is the gas mass reservoir in the galaxy and r gal is the galaxy disk radius. Cooling operates on short timescales in low-mass halos at high redshift, making the baryonic gas available for star formation before supernova feedback, and the λ parameter that drives the conservation of angular momentum in DM halos has a narrow distribution that peaks around 0.05 (Mo et al. 1998). Thus, assuming M gas = f b M vir (where f b is the universal baryonic fraction), r gal = λR vir (where R vir is the halo virial radius) and R vir ∝ M 1/3 vir , the star formation rate is expected to scale with the halo virial mass as SFR ∝ M 1.13 vir . Linking the SFR with the Lyα luminosity and neglecting the effect of dust in such faint galaxies (see Sect. 5.2.1), we can write the Lyα luminosity as L Lyα ∝ SFR ∝ M 1.13 vir . With a low-mass function of halos φ(M)dM ∝ M −2 dM, (for ΛCDM), this relation gives a low luminosity LF: So, the choice of connecting an analytical low luminosity LF with a typical slope −1.8 to the GALICS Lyα LF at higher luminosity is consistent with the assumption on the SFR in the simulation and with expectations from first principles.

A.4. Spatial distribution of ad hoc LAEs
Once the fake Lyα luminosities have been computed, we then have to generate the spatial positions of our ad hoc sources. On the one hand, the redshifts are simply drawn randomly between the minimum and maximum redshifts of the group. On the other hand, the celestial coordinates of ad hoc sources are determined from the spatial clustering of GALICS sources within the MOSAIC. Here, we make the assumption that faint sources follow a clustering pattern similar to that of bright galaxies, rather than being purely randomly distributed. In practice, we start by making a projected map of GALICS objects within the MOSAIC and we apply a two-dimensional Gaussian smoothing to the image, assuming a standard deviation σ of 0.1 . Although this value is arbitrary, it is justified by the fact that 0.1 corresponds roughly to the virial radius of the typical dark matter halos in our MOSAIC groups in the redshift range of our survey (M vir ≈ 10 11 M ; see Sect. 3.3). Then, the smoothed image is normalized such that it can be used as a two-dimensional probability distribution function for the spatial sampling of ad hoc LAEs. For each object, we draw random sky coordinates and implement a rejection method to sample our two-dimensional distribution. We repeat this process until all ad hoc sources in the field, N ext , have been successfully sampled. Figures A.2 and A.3 illustrate the successive steps of our spatial sampling procedure applied to the same mock group as in the previous section. The initial projected map of GALICS LAEs within the MOSAIC is represented by the black dots in Fig. A.2. Figure A.3 depicts the corresponding two-dimensional probability distribution function obtained from the Gaussian smoothing. Finally, the red dots in Fig. A.2 show the resulting spatial distribution of ad hoc LAEs.
Similarly the r h halo scale lengths (in kpc) is derived using the computed linear regression of Halo flux (in erg s −1 cm −2 ) versus scale length (Leclercq et al. 2017, Fig. 8).
r h = a h log F Lyα + b h , with a h = 0.066 ± 0.017 and b h = 3.477 ± 0.264.
The surface brightness is truncated at the virial radius derived from the GALICS halo mass.
We note that for the low-luminosity sources below the mass cut of GALICS and extrapolated from the Lyα luminosity func-tion (see Sect. 5.2.2), we do not have UV magnitudes or halo masses. In those cases we use M UV = −15 and M h = 10 8 M to infer a value of the galaxy scale length and the virial radius. In practice these ultra faint galaxies are smaller than the MXDF PSF and can be considered as point sources.