The ALMA REBELS Survey: Average [CII] $158\,\rm{\mu m}$ sizes of Star-Forming Galaxies from $z\sim 7$ to $z\sim 4$

We present the average [CII] $158\,\rm{\mu m}$ emission line sizes of UV-bright star-forming galaxies at $z\sim7$. Our results are derived from a stacking analysis of [CII] $158\,\rm{\mu m}$ emission lines and dust continua observed by ALMA, taking advantage of the large program Reionization Era Bright Emission Line Survey (REBELS). We find that the average [CII] emission at $z\sim7$ has an effective radius $r_e$ of $2.2\pm0.2\,\rm{kpc}$. It is $\gtrsim2\times$ larger than the dust continuum and the rest-frame UV emission, in agreement with recently reported measurements for $z\lesssim6$ galaxies. Additionally, we compared the average [CII] size with $4<z<6$ galaxies observed by the ALMA Large Program to INvestigate [CII] at Early times (ALPINE). By analysing [CII] sizes of $4<z<6$ galaxies in two redshift bins, we find an average [CII] size of $r_{\rm e}=2.2\pm0.2\,\rm{kpc}$ and $r_{\rm e}=2.5\pm0.2\,\rm{kpc}$ for $z\sim5.5$ and $z\sim4.5$ galaxies, respectively. These measurements show that star-forming galaxies, on average, show no evolution in the size of the [CII] $158\,{\rm \mu m}$ emitting regions at redshift between $z\sim7$ and $z\sim4$. This finding suggest that the star-forming galaxies could be morphologically dominated by gas over a wide redshift range.


INTRODUCTION
Investigating star formation activity in the early Universe is key to understand galaxy formation and evolution. Thanks to deep galaxy surveys with the Hubble Space Telescope (HST) and large ground-based telescopes, it is now widely established that high-redshift galaxies (from z ∼ 11 to z ∼ 4) have been rapidly forming stars at an accelerating rate (e.g., Madau & Dickinson 2014), supported by high gas fractions (e.g., Dayal et al. 2014;Liu et al. 2019;Decarli et al. 2020;Dessauges-Zavadsky et al. 2020;Wang et al. 2022, and see Tacconi et al. 2020, for a review). Investigating the gas supply that fuels the star formation activity requires detailed studies of the spatial distribution of gas within and/or around galaxies. However, this is still poorly understood as detailed observations of the interstellar medium or circumgalactic medium at high redshift have been limited.
In recent years, the Atacama Large Millimeter/submillimeter Array (ALMA) made it possible to observe ISM properties of high-redshift galaxies in great detail. In particular, with its unprecedented sensitivity, ALMA provided us with extremely deep surveys of high-redshift galaxies (see Hodge & da Cunha 2020, for a review). These ALMA observations revealed that the spatial distributions of interstellar gas seen through the far-infrared (FIR) emission line [CII] 158 µm is more extended than the dust continuum and rest-UV emission (e.g., Fujimoto et al. 2019Fujimoto et al. , 2020Herrera-Camus et al. 2021). Previous studies have suggested that these extended gas reservoirs are ubiquitous in high-redshift (z > 5) star-forming galaxies, and are linked to outflow features (e.g., Gallerani et al. 2018;Ginolfi et al. 2020;Graziani et al. 2020;Pizzati et al. 2020). However, these features are not yet confirmed for z > 6 star-forming galaxies, and it is not clear if the extended gas properties systematically change as a function of redshift.
In this paper, we investigate the average size of the [CII] 158 µm emission line and dust continua of z ∼ 7 galaxies based on the on-going ALMA large program Reionization Era Bright Emission Line Survey (REBELS; Bouwens et al. 2021b). We compare the z ∼ 7 size measurements with observations of z ∼ 4 − 6 galaxies from the ALMA Large Program to INvestigate [CII] at Early times (ALPINE; Le Fèvre et al. 2020; Béthermin et al. 2020;Faisst et al. 2020) to investigate if the spatial distribution of the ISM between these two redshift ranges.
This paper is organized as follows: in §2 we describe our observations and the sample used in this study. In §3, we present our methodology for stacking and size measurements. §4 shows the results and discussion on the stacked [CII] emission and dust continuum. Throughout this paper, we assume a cosmology with (Ω m , Ω Λ , h) = (0.3, 0.7, 0.7), and the Chabrier (Chabrier 2003) initial mass function (IMF), where applicable. With these cosmological parameters, 1 arcsec corresponds to 6.28 pkpc and 5.23 pkpc at z = 5 and z = 7, respectively.

Sample and ALMA Observations
Our analysis of z ∼ 7 galaxies is based on observations of the [CII] 158 µm line from the ALMA large program REBELS (PID: 2019.1.01634.L). REBELS used spectral line scans to search for the [CII] 158 µm line in 36 galaxies and the [OIII] 88 µm line in 4 galaxies. In this study, we use the 34 completed observations from Cycle-7, targeting [CII] emission lines, in UV-selected galaxies from z = 6.5 to z = 9. These scans were carried out in band-5 or band-6 using compact configurations (C43-1 and C43-2), resulting in the typical synthesized beam full width at half maximum (FWHM) of ∼ 1.2 − 1.6 . We refer to Bouwens et al. (2021b), Schouws et al. (in prep.), and Inami et al. (2022) for a complete description of the survey, ALMA data processing, and dust continuum detections, respectively. In addition to REBELS, we include 8 additional z > 6.5 galaxies from pilot ALMA [CII] observations (PID: 2015.1.01111.S, 2018.1.00085.S, 2018.1.00236.S). These additional observations employ identical sample selection criteria, spectral scan strategy, and angular resolution to the REBELS survey (see Smit et al. 2018;Schouws et al. 2021Schouws et al. , 2022. In total, we consider 42 separate ALMA targets as part of this analysis (2 sources are in common between REBELS and the pilot programs).
Additionally, we complemented our sample with the ALMA survey targeting z ∼ 4.5 to z ∼ 6 galaxies (ALPINE survey: Le Fèvre et al. 2020), as a lower redshift comparison sample. The ALPINE survey targeted 118 UV-bright main-sequence galaxies, spanning a stellar mass range 8.4 < log (M * /M ) < 11.0 and UV magnitudes of −23.3 < M UV < −19.2 ). These galaxies are the ideal comparison sample at 4 < z < 6, as the ALPINE survey provides the largest and the most homogeneous data set of [CII] 158 µm emission lines and dust continua of 4 < z < 6 starforming galaxies. We refer to Le Fèvre et al. (2020), Béthermin et al. (2020), andFaisst et al. (2020) for a complete description of the survey objectives, the ALMA data processing, and the multiwavelength ancillary observations, respectively. The data are available publicly on the ALPINE website 1 .

ALMA Detections
For our [CII] emission stacking analysis at z ∼ 7, we use 28 individually detected [CII] emission lines (signal to noise ratio; SNR 5.2) from REBELS survey (23 galaxies) and pilot observations (5 galaxies). For the continuum stacking analysis at z ∼ 7, we include 16 individual dust continuum detections (at SNR > 3.3) from REBELS survey (14 galaxies) and pilot observations (2 galaxies). The detection threshold is sufficient to guarantee a ≥ 95% purity for the [CII] emission lines and dust continua (Inami et al. 2022;Schouws et al. in prep). The [CII] emission and dust continuum stacks are made based on individual detections of each emission. Specifically, we did not include galaxies that have [CII] detection but no continuum detection when constructing the continuum stack. This stacking strategy allows us to produce the highest SNR stacks as well as to avoid additional uncertainty of continuum size measurements, potentially arising from unknown continuum position (see §3.1 as well).
The detected [CII] lines have luminosities in the range between 8.1 < log (L [CII] /L ) < 9.2 with a median of log (L [CII] /L ) = 8.8 (Schouws et al. in prep.). The continuum luminosities are estimated using a median conversion factor of L IR = 14 +8 −5 ν L ν,158 µm based on the infrared SED derived by Sommovigo et al. (2022). The estimated dust continuum luminosities have a range 1 https://cesam.lam.fr/a2c2s/ of 11.5 < log(L IR /L ) < 12.2 with a median of log(L IR /L ) = 11.6 (Inami et al. 2022). While the average offset between detected [CII] emission and dust continuum is ∼ 0.35 for our z ∼ 7 galaxies (i.e., well within the synthesized beam size), two galaxies  show much larger spatial offsets ( 1 ). As these galaxies are potentially on-going mergers (Inami et al. 2022), we removed these two from our analysis, leaving 26 [CII] lines and 14 dust continuum detections.
For the 4 < z < 6 galaxies, we selected galaxies from the ALPINE public catalog , and included galaxies individually detected in [CII] for our analysis. The [CII] detection threshold of ALPINE (SNR> 3.5) corresponds to 95% purity, similar to our z ∼ 7 galaxies, ensuring that both observations have little spurious source contamination. Furthermore, to avoid on-going galaxy mergers contaminating our morphology analysis, we excluded [CII] emission lines showing merger events based on the morpho-kinematic classification by Romano et al. (2021). These selection provides 52 galaxies from ALPINE survey: 31 for z ∼ 4.5 and 21 for z ∼ 5.5.

Stacking ALMA Images
To investigate the average [CII] and dust continuum sizes, we made stacked images of the [CII] 158 µm emission lines, and also dust continua of z ∼ 7 galaxies. In the stacks, we included only individually detected [CII] emission lines and dust continua.
We note that stacking non-detected [CII] emission is difficult as [CII] non-detected galaxies only have photometric redshift. For dust continua, it would be still possible to include individually non-detected continuum. Nevertheless, we stacked only individually detected continua to make the highest SNR images, enabling a detailed study of the average [CII] and dust morphology. At the same time, this method helps to avoid possible systemic uncertainty of stacked sizes of dust continuum arising from unknown positions of the individually nondetected emissions. In particular, peak positions of detected [CII] and dust continua show ∼ 0.35 of offsets on average (Inami et al. 2022;Schouws et al., in prep). Thus, stacking non-detected continuum could introduce such systemic uncertainty on the measured small size (see 4.1).
For the stacks, we start with the 35 × 35 moment-0 maps that were made by integrating over the 2 σ velocity width of the [CII] emission lines after continuum subtraction, while the continuum images were made by removing channels that are in the 3 σ velocity widths of the detected [CII] emission lines. While the 2 σ velocity integration of [CII] could miss some of the high velocity, faint component arising from outflowing gas (e.g., Ginolfi et al. 2020), we decided the integration velocity width to focus on galaxy's most [CII]-bright component (i.e., host galaxies of outflows if they exist). This is inline with the previous study that studied "core" component of [CII] emission by selecting ±50 km/s velocity width of [CII] (Fujimoto et al. 2019). To avoid artifacts, these moment-0 maps were not deconvolved with the synthesized beam (i.e., without cleaning). After centering images to each of the peak fluxes, these maps were average-stacked using an inverse variance weighting, where the variance is measured using the background RMS of each map. We derived effective synthesized beams of the stacks by weighted-averaging all the dirty beams employing the same weights as for the moment-0 images.
We examined if the image-based stacking method systematically affects our results by comparing with the visibility-based stack. We performed this test using z ∼ 7 galaxies. We stacked the [CII] visibility data following the methods of Fujimoto et al. (2019). We then measured stacked [CII] sizes using the visibility-based fitting software UVMULTIFIT (Martí-Vidal et al. 2014). The resulting fits are shown in lower panels of Fig. 1. We found both stacked images are almost identical, and size measurements from both images agree well within < 7 %. Given that making stacked visibility data, especially the data concatenations (i.e., concat task in CASA), is time-expensive, and given both methods provide consistent results, we use the image based stacking in the following analysis. In our case, using image based stacking helps to produce a lager number of stacked images required in the bootstrap.
To check if a small fraction of extreme galaxies in our sample could bias our measurements, we performed a bootstrap analysis to estimate the uncertainties coming from both the noise and the sample variance. We made 1000 stacks using randomly selected N galaxies allowing overlaps, where N is the number of galaxies in the original stack. Throughout this paper the reported measurements are the median of the bootstrap resampling, and uncertainties are based on the 16th and the 84th percentiles.

Stacking the Rest-Frame UV Images
To examine sizes of z ∼ 7 galaxies in different wavelengths, we performed a stacking analysis of the restframe UV images using the method of Bowler et al. (2017). Although high resolution observations, such as using HST, is required to provide secure constraints, only a small subset of our sources (4 galaxies) have HST observations (Bowler et al. 2022). To provide tentative limits of rest-frame UV sizes, we used ground based observations; publicly available J-band images from the Ultra VISTA survey (McCracken et al. 2012) and the VIDEO survey (Jarvis et al. 2013); the resulting stack has a point spread function FWHM of ∼ 0.9 . Same as [CII] and continuum stack, we performed a bootstrap analysis to estimate the certainty of the rest-frame UV sizes. A possible caveat of only using ground base observations is discussed in §4.3.
For z < 6 galaxies, we used rest-UV size measurements from Fujimoto et al. (2020), which uses deep HST F160W images.

Size Measurements
We used GALFIT (Peng et al. 2010) to measure the beam-deconvolved effective radius (r e ) on the stacked maps of the [CII], dust continuum, and the rest-frame UV emission. For each GALFIT run, we assumed an exponential disk surface brightness profile, similar to previous studies (e.g., Fujimoto et al. 2019Fujimoto et al. , 2020, which assume that the [CII] surface brightness profile traces the gas distribution of galaxies (e.g., Bigiel & Blitz 2012). The exponential disk profile is in the form of ∝ exp(−r/r s ) where r s is the scale length of the exponential profile, and r s can be converted to the effective radius r e by r e = 1.678 r s (Peng et al. 2010). We also fixed the axial ratio to be 1 (i.e., circular exponential profiles) as stacks are expected to average over randomly oriented galaxies. Using the stacked synthesized images, we find that the [CII] emission line sizes are well resolved and constrained. Fig. 1 shows an example of our fitting results.

Dust and [CII] Sizes of z ∼ 7 Galaxies
We studied radial profiles of the stacked dust continuum and [CII] emission. We first measured surface brightness profiles of the emissions by calculating the median value within annuli of width 0.125 centred on the emission peaks. Errors are estimated using background standard deviation of the stacked images. We then measured surface brightness of the stacked synthesized beams using same method to check how well the stacked emission are resolved. By comparing normalized surface brightness, we compare relative extensions of continuum and [CII] emission (Fig. 2).
The radial profile of the stacked dust continuum is consistent with the stacked synthesized beam. This indicates that the dust continuum is, on-average, much smaller than the current spatial resolution of ∼ 1.3 . Using GALFIT, we find the dust continuum effective radius of r e−cont. = 1.14 ± 0.27 kpc (0.22 ± 0.06 arcsec), where r e−cont. is estimated by the median GALFIT outputs. We note, however, that the estimated r e−cont. is highly uncertain as continua are only barely resolved using the synthesized beam FWHM of ∼ 1.3 . We treat the estimated continuum size as a tentative measurement in the following, and we focus only on [CII] size comparisons with z ∼ 4 − 6 galaxies ( §4.2). Higher resolution observations are required to measure continuum sizes more accurately.
The stacked [CII] emission line is, on the other hand, well resolved and is clearly more extended than the synthesized beam (Fig. 2)  radius of r e = 2.21 +0.23 −0.19 kpc (0.42±0.04 arcsec) for z ∼ 7 galaxies, where r e and errors are estimated in the same manner as the continuum size.
To study rest-frame UV sizes of z ∼ 7 galaxies, we also used GALFIT to fit to the exponential disk profile to the stacked image (see §3.2). We found the stacked rest-frame UV image has an effective radius of r e−UV = 0.83 ± 0.16 kpc. The result shows that the rest-frame UV sizes of our sample are, on average, ∼ 3× smaller than the [CII] sizes. We note, however, that the stacked rest-frame UV image is only marginally resolved as only ground-based data are available (see Bowler et al. 2022 for HST observations of a subset of the REBELS targets). We report the rest-frame UV size as a tentative measurement. See §4.3 for discussions about possible systematic uncertainties of our rest-frame UV size measurement.
The extended [CII] emission and the compact dust continuum show that at z ∼ 7 the [CII] emitting gas is, on average, more extended than the star-forming re-  gions. These results are consistent with previous findings using star-forming galaxies at z ∼ 4 − 6 (Fujimoto et al. 2019(Fujimoto et al. , 2020Herrera-Camus et al. 2021). Combined with previous studies, our results show that the [CII] emitting gas is spatially more extended than actively star-forming regions over a wide redshift range.

Non-Evolution of [CII]
Sizes from z ∼ 7 to z ∼ 4.5 Similar to z ∼ 7 galaxies, the stacked [CII] emission of z ∼ 4.5 and z ∼ 5.5 galaxies from the ALPINE survey are well resolved as the radial profiles of the stacks are more extended than the stacked synthesized beams (FWHM of ∼ 1.1 ). To compare the shape of radial profiles more directly, we created stacked [CII] images having the same beam size. We convolved the stacked images at z ∼ 4.5 and z ∼ 5.5 with Gaussians that have σ 2 = σ 2 z∼7 − σ 2 z∼4/z∼5 , where σ z∼4/z∼5/z∼7 are the sigma of Gaussian fits to the respective beams of each stacked images. We note that the Gaussian convolutions resulted in similar beams for all redshift bins. We find that there is no noticeable difference in the radial pro- emission (red squares) shows little change between z ∼ 4.5 and z ∼ 7, while the rest-UV continuum, on average, is smaller at higher redshift (gray dots and line from Shibuya et al. 2015, and blue open circles from Fujimoto et al. 2020 and this work). The non-evolution of the [CII] size at z ∼ 4 − 7 remains present when we use a UV luminosity matched sample of galaxies at z ∼ 5 (yellow open square). This suggests that high-redshift galaxies might be morphologically dominated by gas, and that star formation activity occupies a progressively smaller fraction of the volume in galaxies towards the highest redshifts.
Additionally, to test a possible bias from the sample selection, we also stacked [CII] emission of 4 < z < 6 galaxies which have the same UV luminosity range as our z ∼ 7 galaxy sample. These UV luminosity matched sample have 39 galaxies at 4 < z < 6, and have UV luminosity of −23.0 < M UV < −21.4. The stack of the UV luminosity matched 4 < z < 6 sample shows r e = 2.17 +0.16 −0.18 kpc, meaning that there is no clear change in the measured size of [CII] from z ∼ 7 (Tab. 1). This suggests that the non-evolution we find is not strongly affected by sample selection in the different ALMA programs (Fig. 4).
Overall, the measurements show that the average [CII] emission sizes have little or no evolution between z ∼ 7 and z ∼ 4 (Fig. 4). Although further high-resolution observations of a large sample is required to confirm this trend (see §4.3), the current measurements suggest that the non-evolution of [CII] emission sizes could be in contrast with the previously found evolution of the rest-UV continuum sizes, scaling as ∝ 1/(1 + z) (e.g., Shibuya et al. 2015, however see also e.g., Curtis-Lake et al. 2016 −0.23 * † Median redshifts of each bin. † † −23.0 < MUV < −21.4 galaxies at 4 < z < 6 (see §4.2) * HST H-band size distribution in Fujimoto et al. (2020) for possible non-evolution). If confirmed, the differential evolution may suggest that high-redshift galaxies might be morphologically dominated by gas, and star formation activity occupies a progressively smaller fraction of the volume in galaxies towards the highest redshifts. Pizzati et al. (2020) showed that the extended [CII] emission can be explained by star formation driven outflow that have outflow velocity ∼ 170 km/s and high mass-loading factor of η = 3.1 defined asṀ = η SFR. Also, the same hydrodynamical models tuned the velocity of winds with ALPINE measurements ) predict a warm ISM of radius r ∼ 2.5 kpc (Graziani et al. 2020), in broad agreements with the results of the present work.
At z ∼ 5, the ALPINE survey detected the outflowing [CII] emission through a secondary broad line in the stacked [CII] spectrum, showing the velocity FWHM of v FWHM ∼ 500−700 km/s . The finding of outflowing [CII] emission is further supported by an individual galaxy study of a star-forming galaxy at z ∼ 5.54, which shows an outflow and extended [CII] emission in a high-resolution ( 0.4 ) ALMA observation (Herrera-Camus et al. 2021). Similar broad [CII] emission line is detected from the stacked [CII] 1D spectrum of the REBELS galaxies (Fudamoto et al. in prep). The feature agrees with an outflow scenario of the extended [CII] emission in the previous observations (e.g., Gallerani et al. 2018;Ginolfi et al. 2020) and the theoretical predictions (Pizzati et al. 2020). A detailed analysis of the outflow properties in comparison with theoretical models will be presented in a future work.

Potential Caveats
Our current analysis is based only on massive and highly star-forming galaxies that have [CII] emission lines and/or dust continuum detections. Ginolfi et al. (2020) reported that [CII] emission sizes change as a function of star formation rate. Similarly, rest-UV continuum observations show that the UV size strongly depends on the galaxies' UV luminosity (e.g., Shibuya et al. 2015;Bowler et al. 2017;Bouwens et al. 2021a). In particular, using high-resolution HST observations, Bowler et al. (2017) showed that the UV-brightest multicomponent z ∼ 7 galaxies (M UV −21.5 mag) have r e−UV ∼ 1−3 kpc, while single-component galaxies with M UV −21.5 mag have r e−UV 1 kpc. Our stacked rest-frame UV size (r e−UV = 0.83 ± 0.07) is consistent with the single-component galaxies in Bowler et al. (2017). However, the current spatial resolution of the J-band images (FWHM ∼ 0.9 ) do not allow us to investigate further details of the rest-frame UV morphologies of our sample. Our findings from the average size comparison between [CII] emission and rest-frame UV emission may not apply to individual galaxies, especially if multi-component galaxies exist in our sample. Higher resolution images of the rest-frame UV emission will be required to study this in detail.
While we combined two ALMA large programs, the samples in each redshift bin only contain ∼ 30 galaxies, and all of them use relatively low resolution observations limiting our analysis to a small dynamic range and uncertain continuum size measurements. Expanding the parameter space (e.g., higher angular resolutions and observations of lower mass galaxies) are required to confirm the gas size evolution of high-redshift galaxies. Especially, higher angular resolution observation (< 1 ) will be crucial to provide more complete morpho-kinematic classifications, and to avoid any possible uncertainties to size measurements by merging galaxies that we cannot find by ∼ 1 resolution (e.g., late-stage mergers).

CONCLUSIONS
In this paper, we presented a study of the average [CII] 158 µm line emission, dust continuum, and restframe UV sizes of star-forming galaxies at z ∼ 7 based on the on-going ALMA large program REBELS. We also estimate the average size of [CII] emission lines of 4 < z < 6 star-forming galaxies using same stacking method to study the [CII] size evolution as a function of redshift. We summarize our findings: (i) At z ∼ 7, the [CII] 158 µm emission line is spatially more extended than the dust continuum and the restframe UV emission. We found the effective radius to be r e = 2.21 +0.23 −0.19 kpc, 1.14±0.27 kpc, and 0.83±0.07 kpc for the [CII], dust continuum, and rest-frame UV emission, respectively. The 2× more extended [CII] emission tracing the gas is consistent with previous finding at z 6.
(ii) Comparing with the stacked [CII] emission sizes of galaxies from z ∼ 4 to z ∼ 7, we found little or no evolution of [CII] sizes in star-forming galaxies between z ∼ 4 and z ∼ 7. If confirmed with further observations, the constant [CII] size could be in contrast with the previously found UV size evolution, and suggests that the [CII] emitting gas dominates the morphologies of high-redshift star-forming galaxies while star formation might occupy a progressively smaller fraction of size in galaxies towards high redshifts.
Further confirming this study would require larger samples of galaxies observed with ALMA, in particular expanding the redshift range observed to constrain the evolution of the gas sizes. At the same time, higher resolution observations are essential to measure sizes more accurately, and to study detailed structures and morphologies of high-redshift star forming galaxies. LG and RS acknowledge support from the Amaldi Research Center funded by the MIUR program "Dipartimento di Eccellenza" (CUP:B81I18001170001).
AF acknowledges support from the ERC Advanced Grant INTER-STELLAR H2020/740120. Partial support from the Carl Friedrich von Siemens-Forschungspreis der Alexander von Humboldt-Stiftung Research Award is kindly acknowledged (AF). IDL acknowledges support from ERC starting grant 851622 DustOrigin. JW acknowledges support from the ERC Advanced Grant 695671, "QUENCH", and from the Fondation MERAC. GCJ acknowledges funding from the "FirstGalaxies" Advanced Grant from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (Grant agreement No. 789056). The Cosmic Dawn Center (DAWN) is funded by the Danish National Research Foundation under grant No. 140.