An ultra-deep multi-band VLA survey of the faint radio sky (COSMOS-XS): New constraints on the cosmic star formation history

We make use of ultra-deep 3 GHz Karl G. Jansky Very Large Array observations of the COSMOS field from the multi-band COSMOS-XS survey to infer radio luminosity functions (LFs) of star-forming galaxies (SFGs). Using $\sim$1300 SFGs with redshifts out to $z\sim4.6$, and fixing the faint and bright end shape of the radio LF to the local values, we find a strong redshift trend that can be fitted by pure luminosity evolution with the luminosity parameter given by $\alpha_L \propto (3.40 \pm 0.11) - (0.48 \pm 0.06)z$. We then combine the ultra-deep COSMOS-XS data-set with the shallower VLA-COSMOS $\mathrm{3\,GHz}$ large project data-set over the wider COSMOS field in order to fit for joint density+luminosity evolution, finding evidence for significant density evolution. By comparing the radio LFs to the observed far-infrared (FIR) and ultraviolet (UV) LFs, we find evidence of a significant underestimation of the UV LF by $21.6\%\, \pm \, 14.3 \, \%$ at high redshift ($3.3\,<\,z\,<\,4.6$, integrated down to $0.03\,L^{\star}_{z=3}$). We derive the cosmic star formation rate density (SFRD) by integrating the fitted radio LFs and find that the SFRD rises up to $z\,\sim\,1.8$ and then declines more rapidly than previous radio-based estimates. A direct comparison between the radio SFRD and a recent UV-based SFRD, where we integrate both LFs down to a consistent limit ($0.038\,L^{\star}_{z=3}$), reveals that the discrepancy between the radio and UV LFs translates to a significant ($\sim$1 dex) discrepancy in the derived SFRD at $z>3$, even assuming the latest dust corrections and without accounting for optically dark sources.


INTRODUCTION
Over the past two decades, impressive progress has been made in constraining the star formation rate density (SFRD) over cosmic time using a multitude of star formation rate (SFR) tracers (e.g., review by Madau & Dickinson 2014), providing vital information for understanding galaxy evolution. There is a reasonable consensus regarding the shape of the SFRD in recent history (z < 2). However, above z ∼ 3, the differences in the SFRD still encompass very different predictions from galaxy evolution models (e.g., Gruppioni et al. 2015;Henriques et al. 2015;Lacey et al. 2016;Rowan-Robinson et al. 2016;Casey et al. 2018;Moster et al. 2018;Behroozi et al. 2019). An accurate mesurement of the evolution of the SFRD is thus vital for the understanding of galaxy evolution.
Several tracers can be used to trace the SFRD. In principle, ultraviolet (UV) light is the most direct tracer of SFR in dust free environments which originates mainly from massive stars. UV light thus directly traces young stellar populations and can be used to constrain the unobscured star formation out to very high redshifts (z 9; e.g., McLure et al. 2013;Bouwens et al. 2015;Bowler et al. 2015;Finkelstein et al. 2015;McLeod et al. 2015;Bouwens et al. 2016;Parsa et al. 2016;Mehta et al. 2017;Ono et al. 2018;Oesch et al. 2018;Bouwens et al. 2021). However, UV observations need significant and uncertain corrections for dust obscuration and are unable to detect the most extreme star-forming galaxies (SFGs) in which star formation is known to be enshrouded in dust (e.g., Smail et al. 1997;Lutz et al. 2011;Riechers et al. 2013;Casey et al. 2014a;Dudzevičiūtė et al. 2020). Therefore, knowledge on how the dust attenuation evolves with redshift is mandatory to study the redshift evolution of the SFRD, particularly as the cosmic epoch z 4 may be dominated by dust obscured star formation (Casey et al. 2018;Bouwens et al. 2020).
Dust, heated by young massive stars, re-emits the absorbed UV light at longer wavelengths and can thus be studied in the far-infrared (FIR) or sub-millimeter (submm) to trace the SFR. Current FIR observations are able to constrain the dust content and SFRD up to a redshift z < 6 (e.g., Rodighiero et al. 2010;Gruppioni et al. 2013;Rowan-Robinson et al. 2016;Koprowski et al. 2017;Dudzevičiūtė et al. 2020;Lim et al. 2020). However, the constraints beyond z 3 are uncertain as the measurement of the FIR LF becomes more challenging. Source confusion and blending limit the ability to detect faint objects in low resolution Herschel /SPIRE observations at z 3 − 4. Such observations are thus biased towards an unrepresentative population of bright sources. In addition, these observations can be significantly contaminated by active galactic nuclei (AGN) as these sources are more numerous at high redshift (Gruppioni et al. 2013;Symeonidis & Page 2021).
Ground-based sub-mm/mm continuum observations of dusty galaxies can help to overcome some of the problems in FIR observations (e.g., Chapman et al. 2005;Hodge et al. 2013;Swinbank et al. 2014;Dunlop et al. 2017;Dudzevičiūtė et al. 2020;Zavala et al. 2021). In particular, ground-based interferometic arrays (e.g., ALMA) offer high-resolution observations and hence do not suffer from source blending. Sub-mm surveys are also less susceptible to AGN contamination as they are predominantly sensitive to the cool-dust in the star-forming population at high redshift (Hodge & da Cunha 2020). In addition, the dust-unbiased tracer [CII] was recently used in several studies conducted with ALMA to study the SFR (Gruppioni et al. 2020;Khusanova et al. 2021;Loiacono et al. 2021). But even with these advantages, sub-mm observations are still impractical to carry out large surveys that would overcome cosmic variance, which can have a strong impact on any counting statistic (e.g., Moster et al. 2011;Simpson et al. 2019;Gruppioni et al. 2020;Loiacono et al. 2021), because of the small field of view. Cosmic variance in sub-mm can be overcome by combining a wide-field single dish observation with expensive interferometric follow-up observations (Simpson et al. 2020).
Radio continuum emission is also an end-product of the formation of the most massive stars. Synchrotron radiation originates from the shocks produced by the supernova explosions (e.g., Sadler et al. 1989;Condon 1992;Clemens et al. 2008;Tabatabaei et al. 2017). Radio emission triggered by star formation is empirically found to correlate well with the far-IR (FIR) emission of SFGs: the FIR-radio correlation. Radio-SFR calibrations most often rely on this empirical FIR-radio correlation, which appears to hold across more than five magnitudes in luminosity and persists out to high redshifts (e.g Helou et al. 1985;Yun et al. 2001;Bell 2003), albeit with ill-constrained redshift evolution (e.g., Sargent et al. 2010;Magnelli et al. 2015;Calistro Rivera et al. 2017a;Delhaize et al. 2017). However, there is some discussion whether the redshift evolution can be ascribed to selection biases Algera et al. 2020b;Smith et al. 2021;Molnár et al. 2021;Delvecchio et al. 2021). In addition, AGN activity will cause strong deviation from the local FIR-radio correlation (Molnár et al. 2018) as accreting super-massive black holes (SMBH) in AGN also accelerate the electrons that produce synchrotron emission.
Radio emission is a tracer of star formation which is, unlike UV, not attenuated by dust. In contrast to FIR observations, radio observations have a high spatial resolution and can cover larger areas of the sky than interferometric sub-mm observations with high angular resolution. Radio observations in the synchrotron regime (∼ GHz frequencies) therefore offer a unique opportunity to study the star formation history of the Universe (e.g., Seymour et al. 2008;Smolčić et al. 2009;Jarvis et al. 2015;Calistro Rivera et al. 2017b;Novak et al. 2017;Leslie et al. 2020;Matthews et al. 2021).
Besides being used to calibrate radio luminosity as a tracer of SFR, the FIR-radio correlation is also often used for the classification of galaxies. A sample used for constraining the SFRD should only consist of sources with radio emission originating from star formation. Therefore one would ideally quantify the emission coming from SF and AGN in all sources. It is, however, easier to simply remove sources that show an excess in radio emission compared to what is expected from the FIR-radio correlation (radio-excess AGN, e.g., Del Moro et al. 2013;Delvecchio et al. 2017;Algera et al. 2020a). Radio-loud AGN are easily removed by this method, as these sources show a large offset from the FIR-radio correlation. A major uncertainty is the ability to distinguish composite sources, which emit low-level AGN emission, from SFGs (e.g., Padovani et al. 2009;Bonzini et al. 2013).
Radio studies to-date have observed radio LFs but struggled to reach the knee of the LF (L ) at z > 1. Because these studies are most sensitive to the SFG population above the knee, the density and luminosity evolution parameters may become degenerate preventing a precise estimate of the knee location. The radio studies from Smolčić et al. (2009) and Novak et al. (2017) thus assumed pure luminosity evolution rather than luminosity and density evolution (Condon & Mitchell 1984) in order to fit the radio LF out to z ∼ 5. Recently, Malefahlo et al. (2022) used a Bayesian approach to reach below the 5σ detection limit of Novak et al. (2017) but only constrained pure luminosity evolution. Enia et al. (2022) used 1.4 GHz-selected sample to constrain the evolution of the radio LF up to z ∼ 3.5 by fitting a modified Schechter function (equivalent to fitting both luminosity and density evolution).
We have taken advantage of the upgraded capabilities of the Karl G. Jansky Very Large Array (VLA) to conduct an ultra-deep, matched-resolution survey in both X-and S-band (10 GHz and 3 GHz, van der Vlugt et al. 2021; hereafter Paper I). In Algera et al. (2020a) (hereafter Paper II), the radio catalogs obtained from Paper I were matched with the rich multi-wavelength data available in the COSMOS field (Scoville 2007) to distinguish between AGN and SFG. In this work, we use the 3 GHz star-forming sample to constrain the faint end of the LF with the faintest SFGs that can currently be probed at high redshift with radio surveys. We also leverage the combined power of the COSMOS-XS survey and the 3 GHz VLA-COSMOS Large Project , which covers a larger 2 deg 2 area to a shallower depth of σ ∼ 2.3 µJy beam −1 , in order to increase our dynamic range and constrain the form and evolution of the LF -and thus ultimately the dust-unbiased SFRDout to a redshift of z ∼ 4.6. This paper is organized as follows: in Section 2 we summarize the data and selection methods. In Section 3, we present the method of constraining the LFs with redshift. We compare our derived radio LFs to the literature in Section 4. In Section 5 we discuss possible biases that need to be taken into account in the derivation of the LF. In Section 6 we use the most appropriate LF to calculate the evolution of the cosmic star formation rate density. Finally, Section 7 summarizes and concludes this work. Throughout this paper, the spectral index, α, is defined as S ν ∝ ν α , where S ν is the source flux density, and ν is the observing frequency. We use a ΛCDM cosmology with parameters H 0 = 70km s −1 Mpc −1 , Ω m = 0.3, Ω Λ = 0.7 (Bennett et al. 2013). We assume a radio spec-tral index of −0.7 unless otherwise stated. We assume the Chabrier (2003) initial mass function (IMF) to calculate SFRs.

Radio data
The COSMOS-XS survey consists of two overlapping ultra-deep single VLA pointings in the COSMOS field at 3 and 10 GHz of, respectively, 90 and 100 h of observation time. Further details on these observations can be found in Paper I but a short summary of the survey follows. The 3 and 10 GHz observations reach a depth of 0.53 µJy beam −1 and 0.41 µJy beam −1 at their respective pointing centres. Both frequencies have a near-equal resolution of ∼ 2.0 (2.14 × 1.81 at 3 GHz and 2.33 ×2.01 at 10 GHz) which is large enough to avoid resolving out faint SF sources.
Details on how the source extraction was performed in both images can be found in Paper I and Paper II. Sources were identified by PyBDSF (Mohan & Rafferty 2015) in the 3 GHz image and resulted in identification of 1540 radio sources.

Counterparts
The counterpart matching method to cross-match the radio sources is fully described in Paper II and briefly summarized below. Counterparts of radio sources were found using a symmetric nearest neighbor algorithm. Counterparts were assigned within a given matching radius. This matching radius was determined through cross-matching with mock versions of the appropriate catalog containing the same sources with randomized sky coordinates.

Radio counterparts
The 10 and 3 GHz data were cross-matched using a matching radius of 0. 9, which yields 91 matches with a false match rate (FMR) of 0.7%. The radio sample was also matched to the VLA COSMOS 1.4 GHz catalog (Schinnerer et al. 2007) using a matching radius of 1. 2 (FMR 0.1%). This generated 185 matches, with 12 sources being detected at all three frequencies (1.4, 3 and 10 GHz).

Optical and near-infrared counterparts
As described in Paper II, the radio observations were complemented with near-UV to FIR-data from various multi-wavelength catalogs: i) the Super-deblended mid-to far-infrared catalog (Jin et al. 2018) containing photometry ranging from IRAC 3.6 µm to 20 cm (1.4 GHz) radio observations. Blended galaxies in lowresolution FIR images are partly disentangled using priors on sources positions from high resolution images and point spread function fitting; ii) the z ++ Y JHK s -selected catalog compiled by Laigle et al. (2016) (hereafter COS-MOS2015) and iii) the i-band selected catalog by Capak et al. (2007).
For each source, we searched for a counterpart in the Super-deblended catalog with a matching radius of 0. 9. To complement the Super-deblended matches with optical and near-IR photometry, we also matched with the COSMOS2015 catalog followed by the i-band catalog with matching radii of 0. 7 and 0. 9, respectively. Sources not in the Super-deblended catalog were matched with the COSMOS2015 catalog using a matching radius of 0. 7. Sources which still lacked a counterpart were matched with the the i-band selected catalog with a matching radius of 0. 9. A flowchart of the matching process can be found in Fig. 3 of Paper II. Overall, 70 sources (4.5%) did not have any optical and NIR counterparts. These sources are not included in the subsequent analysis. An analysis on the properties of these sources can be found in Section 5.3 of Paper II. 1470 sources could be matched to a counterpart in at least one multiwavelength catalog. Based on the matching radii used, we expect a false match rate of 3%, corresponding to ∼ 40 sources.
Spectroscopic redshifts were obtained from the COS-MOS master catalog (M. Salvato et al.; available internally in the COSMOS collaboration). A spectroscopic redshift with a quality factor Q f > 3 was available for 584 radio sources. If a source could be matched within 1. 4 to an X-ray source, the photometric redshift from the Chandra X-ray catalog was used (Civano et al. 2016). Otherwise photometric redshifts from the Super-deblended catalog were used. If a Super-deblended redshift is unavailable, we instead used the photometric redshift from COSMOS2015 or the i-band selected catalog, in that order. 1437 sources have a counterpart and a reliable redshift. 33 sources have no redshift information and are removed from the sample. Out to z ∼ 1, nearly two-thirds of our redshifts are spectroscopic. This fraction drops dramatically toward higher redshift ( Fig. 4 in Paper II shows the distribution of the photometric and spectroscopic redshift).
The accuracy of photometric redshifts is estimated by comparing the photometric and spectroscopic redshift of the 584 sources with a spectroscopic redshift. The median of this comparison is σ(z) = |z spec − z phot |/(1 + z spec ) = 0.008 at all redshifts. The catastrophic failure rate (σ(z) > 0.15) is found to be 4.8%.

Sample selection
To estimate the LF of SFGs, we need to select sources with their radio emission originating solely from star formation. As radio emission can also originate from accreting black holes, we thus need to remove sources that have their radio emission dominated by an AGN. We use the FIR-radio correlation to select the SFGs, where sources with their radio emission dominated by an AGN will be offset from the FIR-radio correlation. The method to remove AGN from the sample is fully described in Paper II and briefly summarized below.
The FIR-radio correlation is defined as the logarithmic ratio of a galaxy's total FIR-luminosity L FIR , measured between (rest-frame) 8-1000 µm, and its monochromatic radio luminosity at rest-frame 1.4 GHz (L 1.4 GHz , following e.g., Bell 2003;Magnelli et al. 2015;Delhaize et al. 2017;Calistro Rivera et al. 2017a): The factor 3.75 × 10 12 is the central frequency of the total-FIR continuum (8 − 1000 µm) in Hz and serves as the normalization. Each galaxy in the sample is fitted using the SED fitting code magphys (da Cunha et al. 2008(da Cunha et al. , 2015, and the total FIR-luminosites are obtained from the best-fitted SEDs.
Rest-frame 1.4 GHz luminosities are determined in Paper II using the measured spectral index for the required K-corrections if available. When only a single radio flux is available, a spectral index of α = −0.7 is assumed instead. The luminosities are then calculated through Here D L is the luminosity distance at redshift z and S 3 GHz is the observed flux density at 3 GHz. The luminosities calculated as a function of redshift are shown in Fig. 1.
In order to quantify the FIR-radio correlation and find outliers, we adopt the redshift and mass-dependent q TIR (M , z) determined by Delvecchio et al. (2021). In order to use this q TIR (M , z), we need to have a mass for the sample. We used the mass given by the COS-MOS2015 catalog for the sources that could be matched with this catalog. For sources without a mass, we used the derived mean mass per redshift bin, ranging from 10 10.18 M to 10 10.70 M . When the q TIR (M , z) of a source deviates more than 3σ from the relation from Delvecchio et al. (2021), it is defined as a radio-excess source, i.e., where σ = 0.22. Such a cut identifies 130 radio-excess sources in total. Recent studies suggest a different evolution, including even a non-evolving q TIR (z), may be more appropriate (Molnár et al. 2018;Smith et al. 2021;Molnár et al. 2021) and we test the effect of such an assumption in Section 5.1. An additional criterion to identify radio-excess sources is established in Paper II, as only 50% of our sample is detected in the far-infrared at ≥ 3σ. For Herschelundetected sources, a conservative FIR-luminosity at the 2σ level is calculated, assuming the FIR-radio correlation as determined by Delhaize et al. (2017). The calculated FIR-luminosity is compared with the empirically determined detection threshold of Herschel. Sources with a calculated FIR-luminosity above the threshold are then identified as "inverse radio-excess" AGN, as they should have been observed with Herschel if their radio emission originated solely from star formation. The additional criterion enables us to identify 62 "inverse radio-excess" sources, of which only 17 were not already identified by the threshold in Eq. 3. We thus find 147 radio-excess sources in total, leaving a total star-forming galaxy sample consisting of 1290 radio sources. The redshift distribution of the sample is shown in Fig. 1 8 . 8 Sources with z > 4.6 like AzTEC-3 are not included because there are too few to give meaningful constraints on the LF.  Novak et al. (2017) studied the SFRD using the VLA-COSMOS 3 GHz Large Project data. This project provided data over the entire 2 deg 2 COSMOS field allowing for the detection of typical SFGs (SFR 100M yr −1 ) out to z ∼ 1.5. The COSMOS-XS survey is ∼ 5 times deeper than the VLA-COSMOS 3 GHz Large Project and when we combine the VLA-COSMOS 3 GHz Large Project data set over the whole field with our deep COSMOS-XS pointing, we obtain a survey "weddingcake" with sufficient dynamic range to enable a meaningful measurement of the form and evolution of the LF.

VLA-COSMOS 3 GHz Large Project
The radio-excess diagnostics by Novak et al. (2017) and Paper II are similar, the overall number of radioexcess sources identified is very similar and the overlap between these two samples is substantial. However, there are a few differences that are addressed in the appendix of Paper II and will be summarized below. Firstly, Paper II used the improved FIR photometry from the Super-deblended catalog (Jin et al. 2018) with a more detailed deblending technique and photometry up to 1.2 mm. Secondly, Novak et al. (2017) used the method from Delvecchio et al. (2017), which uses log 10 (L 1.4 GHz /SFR IR ) to separate radio-excess sources from SF sources. The SFR IR is correlated with the L 1.4 GHz because the SFR IR is calculated with SED fitting from the L IR and owing to the FIR-radio correlation. Therefore log 10 (L 1.4 GHz /SFR IR ) is equal to q TIR up to a constant. Delvecchio et al. (2017) then define radio-excess sources when the log 10 (L 1.4 GHz /SFR IR ) of a source deviates by more than 3σ from the peak of the distribution as a function of redshift. Although this results in a small difference in the total number of radio-excess sources identified in both surveys, we decided to use a consistent criterion for radio-excess sources. We used our threshold which is the q TIR (M , z) determined by Delvecchio et al. (2021) minus 3σ, as described in Section 2.3, to select SFGs using the L 1.4 GHz , L IR,SF and M from the 3 GHz radio catalog . This results in a data-set with 5822 star-forming sources.

ANALYSES
The LF describes the volume density of galaxies as a function of their intrinsic luminosity. We first discuss the method of determining the rest-frame 1.4 GHz LF from the COSMOS-XS survey. We then show how the data can be fitted with a modified-Schechter function assuming different "fixed parameters". Finally, we will consider the addition of the VLA-COSMOS 3 GHz Large Project data to constrain the LF over a larger dynamic range.
3.1. Estimating the LF The radio LFs are derived using the 1/V max method (Schmidt 1968). In each redshift bin, we have computed the co-moving volume available to each source in that bin, defined as V max = V zmax − V zzmin , where z zmin is the lower boundary of the redshift bin and z max is the maximum redshift at which the source could be seen given the flux density limit of the sample. The maximum value of z max corresponds to the upper limit of the redshift bin. For each luminosity bin, the LF is then given by: where V max is the co-moving volume over which the ith galaxy could be observed, Ω is the observed area of 350 arcmin 2 , ∆ log 10 L is the size of the luminosity bin, and w i is the completeness correction factor of the ith galaxy. The parameter w i takes into account the observed area and sensitivity limit and mitigates completeness issues where f flux is the flux density completeness of our radio catalog, f res is a correction for resolution bias and f ctrpt is the fraction of sources, which we have obtained reliable non-radio counterparts, for ith galaxy with flux density S νi . o i (z) is the overdensity factor derived as discussed in Appendix A.
The completeness (f flux ) of the COSMOS-XS radio catalog is shown and tabulated in Paper I. The completeness is based on Monte Carlo simulations where mock sources were inserted and extracted from the image. These simulations take into account the effect of the primary beam and the non-uniform r.m.s.. To correct for the resolution bias, we take the values tabulated in Paper I. These resolution bias corrections (f res ) were calculated using the analytic method as used in Prandoni et al. (2001) assuming a radio size for faint sources. As discussed in Paper II, 6.7% of our radio sources were not assigned a counterpart. To correct for this incompleteness, we use the counterpart completeness (f ctrpt ) of the COSMOS-XS radio catalog which is shown as a function of flux density in Fig 5 from Paper II. The completeness in all bins is upwards of 90%, and no trend with radio flux density can be seen, indicating that the association of counterparts to our radio sources is not limited by the depth of the multi-wavelength photometry.
The error of the LF in each redshift and luminosity bin is calculated as in Marshall (1985): If there are ≤ 10 sources in a luminosity bin, the error is calculated using the tabulated values from Gehrels (1986); we take the tabulated upper and lower 84% confidence interval as σ N and calculate the upper and lower error on the LF as σ Φ (L, z) = Φ(L, z) × σ N . We take the average value of the upper and lower error as the final error on the sparsely populated bins.

Constraining the LF
In order to study the evolution of the radio LF, we derive a parametric estimate of the LF at different redshifts. We assume a modified-Schechter function (e.g Saunders et al. 1990;Smolčić et al. 2009;Gruppioni et al. 2013) for the shape of the LF: This function behaves as a power-law for L < L and as a Gaussian in log 10 L for L > L . Four parameters are used to describe the shape of the LF: L describes the position of the turnover of the distribution, Φ is used for the normalization and α and σ are used to fit, respectively, the faint and bright end of the distribution. Following previous work (e.g., Novak et al. 2017), the values of α and σ will be frozen at the values found for the local LF. In reality, α and σ may both change with redshift.
To find the parameters of the local LF, we used the Markov chain Monte Carlo (MCMC) algorithm, available Table 1 Parameter values describing the pure luminosity evolution fit and the density+luminosity evolution fits.  3.3. COSMOS-XS: Pure luminosity evolution When we fit the LF to the COSMOS-XS data, we only assume the position of the turnover (L , characteristic luminosity) to change with redshift. As we are not able to constrain both L and Φ for the higher redshift bins (z > 0.4), we choose to keep Φ at the local LF value. In reality, Φ may also change with redshift. We assume the shape of the LF to remain unchanged. This pure luminosity evolution can be expressed as

COSMOS-XS
where α L corresponds to the pure evolution parameter and Φ 0 is given in Eq. 7. The range of luminosities and redshifts for which the LFs were calculated were determined from the coverage of the luminosity-redshift plane shown in Fig. 1. All sources are distributed into equally spaced luminosity bins spanning the observed luminosity range. Bins which contain fewer than two sources are merged with the lower L consecutive bin. The gray solid lines in Fig. 1 show the redshift and luminosity bins used. The LFs calculated with the V max method are shown in Fig. 3 and tabulated in Table 4 in Appendix B. As noted in Section 3.1, the LFs are calculated using the 1.4 GHz rest-frame luminosity for easier comparison with previous studies. The black circles show the median luminosity of all sources in the corresponding luminosity bin. The horizontal error bars show the width of the bin. The vertical errors correspond to the errors calculated using Eq. 6. The data points were fitted with the analytical form from Eq. 8 using the MCMC algorithm assuming flat priors 9 . The redshift used in this expression is the median redshift of all the sources in the redshift bin. This value is given in the panels of Fig. 3. The best-fit values for α L are tabulated in Table 1 and the best-fit pure luminosity evolved function is shown with the red line in Fig. 3. Fig. 4 shows α L as a function of redshift. We find that α L remains roughly constant at z < 1.8, thereafter α L decreases with z.

COSMOS-XS + VLA-COSMOS 3 GHz samples:
Luminosity and density evolution Up until now, we have been considering pure luminosity evolution as we lacked sensitivity to constrain both the luminosity and density evolution. To constrain a LF with both luminosity and density evolution, we need both the contribution of the brightest and faintest sources to the LF, otherwise the two evolution parameters become degenerate. As shown in Fig. 3, the LF data from Novak et al. (2017) is more sensitive to the most luminous SFGs, whereas our data extend to the low-luminosity sources. Novak et al. (2017) found that significant density evolution could not be properly constrained by their observations alone as the faint end was not well-sampled. However, combining the two data-sets offers the possibility of jointly constraining the luminosity and density evolution.
To combine the VLA-COSMOS 3 GHz Large Project data with the COSMOS-XS survey, we select the SFGs from the VLA-COSMOS 3 GHz Large Project data, as discussed in Section 2.4 using the criterion described in Section 2.3. We then treat the two data-sets as two separate regions. This means we mask out the observed area of COSMOS-XS in the VLA-COSMOS 3 GHz Large Project. We then combine the two data-sets by means of the Avni & Bahcall (1980) method for coherent analysis of independent data-sets. The depth of the whole sample 10 −6 10 −5  is not constant throughout the region, as COSMOS-XS is ∼ 5 times deeper than the VLA-COSMOS 3 GHz Large Project. A source in the VLA-COSMOS 3 GHz Large Project area will therefore be detectable over the whole joint area, while fainter sources detected in COSMOS-XS are only detectable in the COSMOS-XS area. This means that the maximum volume of space (V max,i ) available for an object in the joint sample is defined by where V fld zmax (with fld = V, XS corresponding to VLA-COSMOS 3 GHz Large Project and COSMOS-XS, respectively) is the co-moving volume available to each source in that field, in a given redshift bin, while Ω fld is the area observed (1.673 deg 2 and 0.097 deg 2 for VLA-COSMOS 3 GHz Large Project and COSMOS-XS, respectively).
For each luminosity and redshift bin, the LF is given by Eq. 4 with o i = 1 for the VLA-COSMOS 3 GHz Large Project sources. The completeness correction w i for these sources consists of a completeness correction for the radio catalog and a counterpart completeness correction. These corrections are derived and described in, respectively, Smolčić et al. (2017) and Novak et al. (2017).
For comparison, we first fit the LF described by the analytical expression from Eq. 8 (i.e., pure luminosity evolution) to the joint COSMOS-XS + VLA-COSMOS 3 GHz data points using the method described in Section 3.2. The best-fit values for α L are tabulated in Table 1 and the best-fit pure luminosity evolved function is shown with the red line in Fig. 5. Fig. 4 shows α L as a function of redshift. At z > 1.8, we find that α L decreases similarly to what was found when pure luminosity evolution was fitted to the COSMOS-XS data-set alone.
With the larger dynamic range probed by the combination of the COSMOS-XS + VLA-COSMOS 3 GHz data-sets, we can now fit not only the position of the turnover with redshift, but also the normalization. This luminosity and density evolution can be described as Because the joint COSMOS-XS + VLA-COSMOS 3 GHz data-sets constrain both the high and low luminosity ends, the evolution parameters (α D and α L ) are less degenerate. The fit with luminosity and density evolution is shown in Fig. 5. Fig. 13 shows the two dimensional posterior probability distributions of α L and α D at each redshift. Fig. 4 shows the fitted parameters α L and α D as a function of redshift. We find that, when allowing for both luminosity and density evolution, α L decreases and α D increases to z ∼ 1, above which α L is constant with z while α D decreases.  Figure 4. The best-fit parameters for the luminosity functions as a function of redshift. The upper panel shows the evolution of the luminosity parameter α L . The lower panel shows the evolution of the density parameter α D . Open circles correspond to pure luminosity evolution for the COSMOS-XS survey. The red dashed line shows the fitted evolution to these points of the form α L + zβ. The green dashed line shows the simple pure luminosity evolution model described by Novak et al. (2017). The luminosity parameter shows a similar evolution as the evolution that Novak et al. (2017) described. The filled symbols correspond to the joint COSMOS-XS + VLA-COSMOS 3 GHz data. The red and blue filled symbols correspond to the best-fit parameters found for, respectively, the pure luminosity evolution and the luminosity and density evolution fitted to the joint sample. The density parameter shows a strong evolution while we observe little evolution in the luminosity parameter.

A COMPARISON WITH LUMINOSITY FUNCTIONS FROM THE LITERATURE
In the following section we compare our results to literature LFs derived from radio, FIR and UV observations. 4.1. Radio Fig. 3 and Fig. 5 show the determination of the radio LF of Smolčić et al. (2009) andNovak et al. (2017). Smolčić et al. (2009) derived the radio LF up to z < 1.3 using 340 galaxies from the VLA-COSMOS 1.4 GHz survey conducted over the 2 deg 2 COSMOS field (Schinnerer et al. 2007). Our data generally lies slightly above the data from Smolčić et al. (2009), which could be due to the different selection criteria. Specifically, Smolčić et al. (2009) only used rest-frame optical colors to select SFGs. However, at z > 0.6, the high luminosity bins (log 10 L 1.4 GHz 24W/Hz) from Smolčić et al. (2009) lie above our data. This could be due to contamination of their sample from AGN (Smolčić et al. 2009), as they used a different AGN selection method.   The VLA-COSMOS 3 GHz Large Project  was also conducted over the COSMOS field and yielded about four times more radio sources compared to the 1.4 GHz data of Schinnerer et al. (2007). This resulted in LFs up to z 5.7 using 5915 SFGs selected as described in Section 2.4. Overall, our radio LFs generally agree very well with those derived by Novak et al. (2017) based on this data-set. Because of the large field of view of the VLA-COSMOS 3 GHz Large Project, the LF data from Novak et al. (2017) is more sensitive to the most luminous SFGs, especially for z < 1.6. On the other hand, as Fig. 3 shows, the COSMOS-XS data extends towards lower luminosity and adds in almost every redshift bin two low-luminosity data points. Given the good agreement between the COSMOS-XS and VLA-COSMOS 3 GHz data-sets and the larger constraining power of the combination (Section 3.4), we use the radio LFs derived from the combined COSMOS-XS + VLA-COSMOS 3 GHz data-sets for the comparison with the LFs derived from the IR and UV in the following sections.

Far-infrared
If the FIR-radio correlation is linear (see Section 2.3), both FIR and radio LFs should follow each-other well. In Fig. 6, we compare our results with the FIR LFs from Gruppioni et al. (2013), Koprowski et al. (2017), Gruppioni et al. (2020) and Lim et al. (2020). To adapt their results with our redshift bins, we simply plot the value of Φ for which the mean z is within our redshift bin. Gruppioni et al. (2013) used the data-sets from the Herschel PEP Survey, in combination with the HerMES imaging data to derive the evolution of the FIR LFs up to z ∼ 4. Koprowski et al. (2017) found their total FIR LF measurements based on SCUBA-2 850 µm observations. Gruppioni et al. (2020) determined the total FIR LF using the non-target ALPINE sources observed with ALMA. These 56 sources were blindly detected at 860 µm within the fields of targeted galaxies of the ALPINE survey, and the total FIR was derived using SED fitting using semi-empirical templates. Finally, Lim et al. (2020) used a SCUBA-2 450 µm map in the COSMOS field covering a area of 300 arcmin 2 to construct the FIR LF.
To convert the total FIR LF given by Gruppioni et al. (2013), Gruppioni et al. (2020) and Lim et al. (2020) to a radio LF, we use the FIR-radio correlation as described in Eq. 1, with q TIR as the FIR-radio correlation from Delhaize et al. (2017) and rewritten as: To find the total FIR LF for Koprowski et al. (2017), we used the L 250µm /L FIR ratio given by the Micha lowski et al. (2010) template to convert the rest-frame 250 µm LF from the SCUBA-2 data to total FIR LF, which is then converted to a radio LF using Eq. 11. Similar to what Novak et al. (2017) found, our data agree well with these FIR surveys. However, at z > 2, our LFs are systematically lower than Gruppioni et al. (2013). We find that the more recent studies from Gruppioni et al. (2020) and Lim et al. (2020) are also higher than our data, although these data-sets are more uncertain due to the low number of sources per bin. The offset between our data and the studies from Gruppioni et al. (2013), Gruppioni et al. (2020) and Lim et al. (2020) at z > 2 may be partly attributed to the presence of AGN in the FIR selected sample. While we start from a radio sample that excludes AGNs, as described in Section 2, Gruppioni et al. (2013) and Gruppioni et al. (2020) derive the total FIR LF and thus include sources powered by AGN. In addition, the fraction of AGN is found to increase with redshift: Gruppioni et al. (2013) find that AGN largely dominate the FIR luminosity density at z 2.5. However, Gruppioni et al. (2020) find that the large majority of the SEDs of their sources are best fitted by star-forming or composite templates. In contrast, the Lim et al. (2020) study excludes sources identified as AGN based on their X-ray, mid-IR or radio-emission and finds a low AGN fraction compared to literature studies due to their deep observations. These probe a faint submm galaxies (SMGs) population which are less likely to host an AGN. The difference can thus not solely be explained by the presence of AGN. Some of the difference could therefore be due to the evolving q TIR (z) used in the conversion from FIR to radio. This will be discussed in more depth in Section 6.3. In addition, there are a lot of uncertainties in measuring the FIR luminosity from a few data points which is reinforced by discussion of Gruppioni & Pozzi (2019) on the study of Koprowski et al. (2017). We find that the Koprowski et al. (2017) LFs are systematically lower than the other FIR studies over the whole luminosity range and match our data at z > 2. Gruppioni & Pozzi (2019) explained the discrepancy to other FIR studies by attributing the difference to a choice of sub-mm SED and sample incompleteness.
In Fig. 6 we also compare our results with the observationally motivated sub-mm LF models from Casey et al. (2018). They developed an evolutionary model based on existing measurements of sub-mm number counts, redshift distributions, and multi-band flux information to study the shape and behavior of the FIR LF out to high redshift (z > 4). They considered two extreme cases: a dust-poor model, where the abundance of very dustrich dusty star-forming galaxies (DSFGs) relative to UVbright galaxies is low (< 10 % at z = 4), and a dust-rich model, where DSFGs dominate and contribute > 90 % to the star formation at z = 4. Both models include a "turning point" redshift at which the knee of the LF (L ) and the characteristic number density of the LF (Φ ) are transitioning in their evolution. For example, Φ might evolve like (1 + z) −2.8 up to z ∼ 1.5, and then gradually transition to (1+z) by a redshift of z ∼ 3.5. The turning point for the dust-poor and dust-rich model lies at, respectively, z = 2.1 and z = 1.8. Before this redshift the models use the same evolution parameters. Thereafter, they will evolve at different rates.
The dust-poor model is similar to the often adopted evolutionary scenario in the rest-frame UV literature. It represents the model that the dust-formation timescale is longer than the timescale for the formation of UV-bright galaxies. This means that DSFGS are rare at z > 4 in this model and only dominate the star formation at z ∼ 2. The dust-rich model is quite extreme and suggests that most star formation at high redshift was isolated to rare starbursts with very high SFR and that DSFGS would dominate the star formation at z > 1.5. Casey et al. (2018) showed that both models were consistent with the sub-mm data that existed at that time.
The predictions of the FIR LFs by Casey et al. (2018) shown in Fig. 6 are converted as discussed above. The converted LFs are consistent with our measurements at z < 2.5 and from z > 2.5 the models start to deviate from each other. At z > 2.5 the dust-rich model overpredicts our data, while the dust-poor model matches quite well, as also seen with the VLA-COSMOS 3 GHz Large Project data alone .
In summary, we find that our radio LFs are roughly consistent, within the error bars, with the FIR LFs. At z > 2 our LFs are systematically lower than Gruppioni et al. (2013), which we attribute at least partly due to AGN contamination. In addition, we find that the radio data is most consistent with the dust-poor model from Casey et al. (2018).

UV
It is also interesting to compare our radio LFs with previous UV LFs studies. The UV probes fainter sources at higher redshift and therefore offers a comparison sample complementary to that of FIR-based studies. In addition, the SFR calibrations from Kennicutt (1998) are self-consistent which means that all SFR tracers should result in roughly the same SFR estimate. The UV and radio both trace SF where radio is mostly sensitive to SFGs with a high SFR and UV is probing emission from SF not obscured by dust. The UV and radio LFs should thus follow each-other well if the UV can be fully corrected for dust extinction. In Fig. 7 we compare our results with the UV LFs from Mehta et al. (2017), Ono et al. (2018) and Bouwens et al. (2021). Mehta et al. (2017) used deep NUV imaging data as part of the Hubble Ultra-Violet Ultra Deep Field program to find the rest-frame 1500Å UV LF at z ∼ 1.7, 2.2 and 3.0. Ono et al. (2018) conducted the GOLDRUSH project with the optical images taken by the HSC-SSP which cover a large area of ∼ 100 deg 2 . The sample is constructed using the so-called drop-out technique. In this case the sample consisted of a total of ∼ 580, 000 Lyman break galaxies at z ∼ 4 − 7. The UV LF is then derived by combining the LFs from the HSC Subaru program with the LFs from the ultra-deep Hubble Space Telescope legacy surveys. Bouwens et al. (2021) derived UV LFs at z ∼ 2 − 10 based on the Hubble data from various legacy fields covering an area of ∼ 0.3deg 2 which contains > 24, 000 sources.
The conversion needed to compare LFs at radio and UV wavelengths is derived by Novak et al. (2017) following Kennicutt (1998): where M 1600,AB is rest-frame UV, A U V is the extinction given by 4.43 − 1.99β with β the UV spectral slope and q TIR is the FIR-radio correlation defined by Delhaize et al. (2017). To correct the UV data for dust extinction, we used the UV spectral slope β as tabulated as a function of magnitude by Bouwens et al. (2009)  Best fit parameters of the local luminosity function, as described in Eq. 7, fitted to the luminosity function from Bouwens et al. (2020) and to our radio luminosity function + the luminosity function from Bouwens et al. (2020) Mehta et al. (2017), in order to scale them from 1500Å to 1600Å. This was done by roughly defining the average β-slopes for the sources (β ∼ −1.7) and deriving the correction from there. To adapt the UV LF results to our redshift bins, we simply plot the value of Φ for which the mean z is within our redshift bin. We find that our LFs predict an excess of bright sources compared to Bouwens et al. (2021), Mehta et al. (2017) and Ono et al. (2018) at z ∼ 2.9 and z ∼ 3.6. The excess is especially striking at high luminosity at z ∼ 3.6, where the UV dust correction is most severe. Although the UV LFs have been corrected for dust extinction, they still seem to miss a part of the galaxies with dust obscured SF, as previously noted by To determine the UV underestimation of the obscured SFR suggested by our data, we fitted the local LF, as described in Eq. 7, to the dust corrected data from Bouwens et al. (2021) with all parameters unconstrained. The obtained best fit parameters are tabulated in Table 2. The fit is shown in Fig. 7. We then fitted the local LF in the same way to a combination of the radio data and the dust corrected data from Bouwens et al. (2021). We disregarded the three most luminous LF points from Bouwens et al. (2021) and took the radio data points instead. The obtained best fit parameters are tabulated in Table 2 and the fit is shown in Fig. 7. We then integrated the two fits from L z=3 , as defined by Bouwens et al. (2021), which corresponds to log 10 L 1.4 GHz = 21.14 WHz −1 to ∞, to find the difference between the two. We find the UV data presented in Bouwens et al. (2021) underestimate the integrated LF by 21.6 ± 14.2 % at these redshifts (3.3 < z < 4.6). We can interpret this estimate as a lower limit, as the mean redshift of the UV sample presented in the last panel of Fig. 7 is 3.8, slightly higher than the median redshift of the radio sample and we expect the UV LF to increase between z = 3.7 and z = 3.8.
We additionally note that the radio LFs displayed in Fig. 7 do not include any of the "optically dark" sources as described in Paper II. These 70 sources were not   The best-fit pure local LF for the data from Bouwens et al. (2021) at z ∼ 4.6 is shown with the dashed red line, while the solid red line shows the joint fit to the UV+radio data discussed in Section 4.3. The comparison of these two fits in the last panel suggests that at the highest redshifts probed here (3.3 < z < 4.6), the UV data underestimate the integrated LF by at least 21.6 ± 14.2 % where the LF is integrated down to L z=3 .
matched to a counterpart in any of the catalogs used in the counterpart matching as described in Section 2.2. Some of these sources could be spurious detections but most of these "optically dark" sources are expected to be real; we expect only ∼ 20 spurious sources. As discussed in Section 3.1, we do correct for the counterpart completeness with f ctrpt . This small correction as a function of flux density is done over the whole redshift range. However, the method used in Paper II, which finds 29 robust "optically dark" sources, shows that these sources are likely to have a redshift of z 4, similar to what was found in ALMA follow up of sources without an optical counterpart (e.g Dudzevičiūtė et al. 2020;Smail et al. 2021). The LF at z ∼ 4 including these "optically dark" sources will be higher than shown in Fig. 7. Different works have already identified "optically dark" sources, extreme SFGs heavily obscured by dust which lack an optical or near-IR counterpart, out to high redshift (z 5) (e.g Dannerbauer et al. 2008;Walter et al. 2012;Riechers et al. 2020). Wang et al. (2019) reported the results from the ALMA follow-up of a population of optically dark galaxies, and found a fraction of them to be massive dusty galaxies at high-redshift. They concluded that this population constitutes a significant fraction of the SFRD at z > 3. In addition, Talia et al. (2021) estimated that dust-obscured star-forming galaxies, found based on their emission at radio wavelengths and the lack of optical counterparts, have a contribution to the SFRD which can be as high as 40% of the previously known UV-SFRD. More recently, Enia et al. (2022) estimated the contribution of "optically dark" sources (H-dark galaxies) to the SFRD using 8 "optically dark" galaxies found at z ∼ 3, and finding they contribute 7 − 58% to the UV-based SFRD. The discrepancy between our radio LF and the UV LFs will thus also be greater with the inclusion of the "optically dark" sources. The derivation of the radio LF including these sources and implications that follow will be further discussed in a future paper.
In summary, we find our radio observations show an excess above the UV LFs for z > 2.9 even without including the "optically dark" sources. Although the UV LFs have been corrected for dust extinction, we estimate that they miss at least 21.6 ± 14.2 % of the star formation traced by the integrated radio LF.
4.4. Radio vs. FIR vs. UV As discussed above, the LF can be constrained by using different tracers: radio, FIR and UV. Each tracer may be affected by different biases. Radio observations can be contaminated by AGN. FIR and sub-mm observations lack, respectively, high resolution and large field of view observations. In addition, these bands have a limited sensitivity to galaxies at z > 3 and FIR observations can be significantly affected by AGN. UV observations need significant corrections for dust-obscuration and are unable to uncover the most extreme SFGs. By comparing all three tracers, we are able to find which bias is most impactful.
As discussed in Section 4.2, the radio data presented here are roughly in agreement with the FIR observations. In addition, Fig. 7 shows that our radio observations show an excess above the UV observations at z > 2.9. The current radio data thus confirm a discrepancy that exists between the FIR and UV data. This was already suggested by the work of Novak et al. (2017), and the new analysis of the combined data strengthens the evidence for the discrepancy and suggest an underestimation of the UV LF. Although radio and FIR observations share the risk of AGN contamination, these AGN are observed at a different wavelengths and thus have different methods of removal. Seeing that the radio and FIR observations are moderately consistent suggests that the most significant issue is with UV observations and their dust corrections The IRX-β relation (Meurer et al. 1999) is used in UV studies to attempt to correct for dust extinction. Their relation consists of the ratio of total FIR to UV luminosity (L FIR /L UV = IRX), a proxy for extinction, and the UV spectral slope (β), which depends on the column density along the line of sight that is attenuating the UV light. The relation is therefore sensitive to a range of ISM properties including dust geometries, dust-to-gas ratios, dust grain properties, and the spatial distribution of dust. Mancuso et al. (2016) have built an intrinsic SFR function and find that, even when corrected for dust absorption with the IRX-β relation, UV observations underestimate the intrinsic SFR for galaxies with a SFR > 30M yr −1 . Their result suggests a galaxy population at z 4 with large dust-obscured SFR of 100M yr −1 , the higher redshift counterparts to the dusty SF population observed by FIR observations at z 3. In addition, several studies have already shown that low redshift luminous infrared galaxies -so-called luminous and ultra-luminous galaxies (10 11 ≤ L IR < 10 13 L , LIRGs and ULIRGs) and high redshift dusty SFGs (L IR ≥ 10 12−13 L , DSFGs) -are offset from the nominal UV spectral slope (Goldader et al. 2002;Howell et al. 2010;Casey et al. 2014b;Bourne et al. 2017). Furthermore, Khusanova et al. (2020) recently concluded that the brightest Lyα emitters at z > 5 are very diverse and found that these galaxies have large scatter in observed β values. These studies show that UV observations miss a part of the galaxies with dust obscured SF and question the existing IRX-β relation as a method of dust correction.
In particular, we know the reliability of the IRX-β relation for high-redshift galaxies has several issues. Firstly, the shape of the FIR SED at high-redshift is poorly constrained due to a lack of sampling of the SED peak. This means that the FIR luminosity is derived from FIR SED models that are fitted at lower redshift. We also know that the dust temperature (T dust ) is crucial for the derivation of L IR , with an incorrectly assumed T dust changing the L IR by as much as an order of magnitude (e.g., Hodge & da Cunha 2020). Unfortunately, T dust is typically highly uncertain for lower luminosity highredshift galaxies and might depend on various galaxy properties (e.g., Chapman et al. 2003;Magnelli et al. 2014). In addition, the distribution of dust could be more patchy in high-redshift galaxies due to their turbulent nature. The UV slope is then dominated by the least obscured part of the galaxies, leading to an underprediction of the necessary correction (Faisst et al. 2017). These issues indicate that different dust corrections for bright and highly star-forming galaxies at high redshift are necessary, and we may thus need a different approach to correctly estimate dust corrections for these galaxies.

Evolution parameters
In this section, we compare the implied evolution of our LF parameters (Fig. 4) with previous multi-wavelength works from the literature. The FIR studies from Gruppioni et al. (2013), Koprowski et al. (2017) and Lim et al. (2020), and the UV study from Bouwens et al. (2021), describe the position of the turnover in the FIR and UV LF with L and M , respectively. The normalization of the LF is described by Φ . In these studies, L /M and Φ are simultaneously fitted. The FIR studies find the position of the turnover to evolve to higher luminosities. Bouwens et al. (2021) also find the characteristic luminosity M to increase to z ∼ 3, but thereafter they find it to remain relatively fixed over the redshift range z ∼ 3 − 8. This kind of evolution can also be seen in the study by Gruppioni et al. (2013), who describe the luminosity evolution of L up to z ∼ 1.85 as L ∝ (1 + z) 3.55±0.10 . Thereafter they find a somewhat slower evolution of L ∝ (1 + z) 1.62±0.51 up to z ∼ 4. The normalization of the LF was found to decrease with redshift by Gruppioni et al. (2013), Koprowski et al. (2017) and Bouwens et al. (2021). Lim et al. (2020) also found this once the faint-end slope α was fixed. Gruppioni et al. (2013) describe the normalization evolution again with a break. They find Φ to slowly decrease as Φ ∝ (1 + z) −0.57±0.22 up to z ∼ 1.1, followed by a quick decrease Φ ∝ (1 + z) −3.92±0.34 up to z ∼ 4.
As shown in Fig. 4, we find a strong evolution of the luminosity parameter, with a clear break at z ∼ 1, when we fit the COSMOS-XS survey and the combined data-sets for pure luminosity evolution. The evolution at z > 1 can roughly be fitted with (3.40 ± 0.11) − (0.48 ± 0.06) × z, shown with the red dashed line in Fig. 4. This agrees with the evolution that was found by Novak et al. (2017). The green dashed line in Fig. 4 shows the simple pure luminosity evolution model described by Novak et al. (2017), where they fit an evolution of (3.16 ± 0.2) − (0.32 ± 0.07) × z. In addition, we clearly see an increase of the position of the turnover, as seen before in UV, FIR, and radio studies.
When we instead fit simultaneously for luminosity and density evolution, we find a strong evolution of the density evolution parameter, whereas the evolution in the luminosity parameter remains relatively fixed. While the evolution of these parameters could be influenced by the need to fix the bright and faint end shapes of the distribution to the local values (Section 3.2), we note that the same caveat applies to all studies, regardless of the LF form fitted, that fix these parameters (e.g., Novak et al. 2017;Enia et al. 2022). We will see that this den-sity+luminosity evolution has an effect on the cosmic star formation history in Section 6.

POTENTIAL BIASES AND ADDITIONAL CAVEATS
Before we discuss the implications of our derived radio LFs for the cosmic star formation rate history, we first discuss the possible biases and additional caveats that need to be taken into account when deriving and interpreting the radio LF.

AGN contamination
A recent paper by Symeonidis & Page (2021) investigated the difference between the flatter high luminosity slope seen in the FIR LF compared to the UV LF. They constrained the AGN LF using X-ray observations and then converted the X-ray AGN LF to the FIR AGN LF. This AGN LF was then compared to the total FIR LF, which corresponds to emission from dust heated by stars and AGN. Symeonidis & Page (2021) claim that at z < 2.5, the high luminosity tail of the AGN FIR LF and total FIR LF converge, suggesting that the most FIR-luminous galaxies are AGN-powered. They conclude from this that the flatter high-luminosity slope seen in the FIR LF compared to that in the UV and optical can be attributed to the increasing fraction of AGNdominated galaxies with increasing total FIR luminosity. The AGN FIR LF and total FIR LF can be used to find the maximum value of SFR that would be believable if computed from the FIR luminosity. The range of maximum SFRs is between 1,000 and 4,000 M yr −1 at the peak of cosmic star formation history (1 < z < 3). When converted to radio luminosities, this gives a range of log 10 L 1.4 GHz ∼ 23.6 − 24.1. This suggests that the brightest bins in the radio LF in this redshift range could be contaminated with sources powered by AGN.
To assess to what extent our SFG sample is contaminated by AGN, we divide our data into four equally populated redshift bins and stack the X-ray images. The stacking is done with the online available tool CSTACK, which utilizes a mean-stacking method 10 . X-ray luminosities are calculated from the stacks assuming a power law spectrum with a slope of Γ = 1.4. Fig. 8 shows the X-ray luminosities as a function of FIR-luminosities, where the error-bars represent the bootstrapped spread on the median. The solid line shows the median trend found by Symeonidis et al. (2014), and the dashed line constitutes the 2σ scatter. We find little excess in the X-ray compared to the typical X-ray -star-formation 10 CSTACK was developed by Takamitsu Miyaji and can be found at http://cstack.ucsd.edu/. relations; the stacked data matches the trend from Symeonidis et al. (2014) within the scatter. Thus, we conclude that our star-forming sample is not substantially contaminated by AGN. In addition to our examination of the contamination of unidentified AGN in our radio LFs, we want to assess the influence of our SFG selection criteria. As discussed in Section 2.3, we used the following selection criterion to select SF sources: These sources do not show an excess in radio emission with respect to their FIR emission and are likely powered by SF. To assess the impact of this criterion, we also investigated using a non-evolving local value as defined by Bell (2003): where σ = 0.26 is the 1σ scatter in FIR-radio relation as found by Bell (2003). This resulted in a sample containing 187 fewer SFGs than the original sample. The number of sources excluded by this new criterion is thus not much larger than excluded by Eq. 13. This can also be seen in Fig. 9, where the difference between the original sample and the sample derived with the new criterion is small. The biggest impact can be seen in the last two redshift bins, where the high luminosity points differ slightly in the new sample. We thus conclude that the influence of our selection criterion used to select SFGs is small.

Radio spectral indices
Where possible, we calculate the spectral index of our sources using the other radio data available over the field.  We also show the LFs from Gruppioni et al. (2013), converted as described in Section 4.2 to radio LFs. The open symbols are converted assuming a constant FIR-radio correlation of 2.64 (Bell 2003) and the filled symbols are converted assuming an evolving FIR-radio correlation. The difference between the open and filled symbols shows the influence of the FIR-radio correlation on the comparison between the radio LF and FIR LF. The FIR-radio correlation remains the largest uncertainty in this comparison.
In particular, we find that 8% and 6% of our sources have a spectral index calculated with the 1.4 GHz data and the 10 GHz data, respectively. However, we were unable to measure the spectral index for 86% of our sample, as these sources were only detected at 3 GHz. Because our survey is ∼ 19 times deeper than the 1.4 GHz survey (σ ∼ 10 µJy beam −1 , Schinnerer et al. 2010), this induces a bias towards steeper spectra. Sources at the limit of our survey would need to have a spectral index of α = −3.9 to be observed in the 1.4 GHz survey. The median spectral index of sources matched at 1.4 GHz is α = −0.91. Because the 3 GHz survey is matched in depth with the 10 GHz survey, this bias does not exist for sources matched with the 10 GHz data. The median spectral index of these sources is α = −0.63. For the bulk of our sample, we therefore assume a standard spectral index of α = −0.7, which is consistent with that typically found for SFGs (Condon 1992;Kimball & Ivezić 2008;Murphy 2009;Smolčić et al. 2017). An uncertainty in the spectral index of ∆α = 0.1 would change L 1.4 GHz by 0.08 dex and 0.11 dex at z = 2 and z = 5, respectively (Novak et al. 2018). Assuming the canonical spectral index of α = −0.7 thus adds a large uncertainty to the measured LF. However, the observed spread in spectral indices is symmetric (σ ≈ 0.35; e.g., Kimball & Ivezić 2008;Smolčić et al. 2017) and therefore expected to cancel out statistically.
When we derive spectral indices, we assume the radio SED to be well described by a single power-law. However, there are processes which can alter the shape of the radio spectrum. For example, if thermal free-free emission substantially contributes to the radio emission (e.g., Tabatabaei et al. 2017;Tisanić et al. 2019) the spectrum will flatten and the single power-law will not hold. Recent work by Algera et al. (2021) using COSMOS-XS and COLDz on the radio spectra of high-redshift star-forming galaxies finds thermal fractions and synchrotron spectral indices typical of local star-forming galaxies, suggesting this is not a major source of uncertainty. Future deep, multi-frequency radio observations of larger samples will be necessary to study the radio SEDs of SFGs and understand the physical processes shaping them across cosmic redshift.

COSMIC STAR FORMATION RATE HISTORY
In this section, we first discuss how to calculate the SFRD from the radio LFs (Section 6.1). We then discuss how the form of the LF fitted and FIR-radio conversion can affect the results (Sections 6.2 and 6.3, respectively). Finally, we compare our results to literature results derived from radio, FIR and UV observations (Section 6.4).
6.1. Calculating the SFRD Having constructed the rest-frame 1.4 GHz LF, it is now possible to establish the redshift evolution of the star formation rate density. To convert luminosity density into a star formation rate density, we use the functional form given in Delvecchio et al. (2021): SFR(L 1.4 GHz ) = f IMF 10 −24 10 qTIR(z) L 1.4 GHz , (15) where SFR is the star formation rate in units of M /yr, f IMF is a factor accounting for the IMF (f IMF = 1 for a Chabrier IMF and f IMF = 1.7 for a Salpeter IMF) and L 1.4 GHz is the rest-frame 1.4 GHz luminosity in units of W Hz −1 . Novak et al. (2017) stresses that since low-mass stars do not contribute significantly to the total light of the galaxy, only the mass-to-light ratio is changed when the Chabrier IMF is used. Following Novak et al. (2017), we therefore used the Chabrier IMF.
The SFRD can then be estimated by taking the luminosity-weighted integral of the analytical form of the fitted LF and converting the luminosity in the integral to SFR. The integral of the SFRD can thus be written as: This integral gives the SFRD of a given epoch. Unless stated otherwise, all results show the SFRD obtained by integrating the fitted LF from 0.0 to → ∞. Our errors are estimated from the fitting parameters uncertainties through boostrapping whereby the uncertainties in q TIR (z) are taken into account. The quoted errors do not account for any systematic errors due to cosmic variance.
6.2. Luminosity evolution vs. density and luminosity evolution Fig. 10(a) shows the SFRD computed using the different fits to the radio LF discussed in Section 3.3 and Section 3.4, and using only the COSMOS-XS data compared to the combination of the COSMOS-XS + VLA-COSMOS 3 GHz data-sets. The combination enables us to fit not only pure luminosity evolution, but also to constrain the joint density+luminosity evolution.
When we compare all three results, we find that they all roughly agree up to z ∼ 1.8. At that point, both pure luminosity evolution model fits show an elevated SFRD at high redshift compared to the Madau & Dickinson (2014) curve (as also seen for the VLA-COSMOS 3 GHz data alone Novak et al. 2017). However, when we fit density+luminosity evolution to the combined data-sets, as favored by the data (Fig. 5), we instead find that the SFRD falls below the Madau & Dickinson (2014) curve at z 1.8. In the following sections, we will use the SFRD derived from the combined COSMOS-XS + VLA-COSMOS 3 GHz data-sets using density+luminosity evolution for the comparison with the SFRD derived from radio, FIR and UV observations. 6.3. FIR-radio conversion As Eq. 15 shows, the calibration of the SFR depends on q TIR (z) (see Section 2.3). Therefore not only is the FIR-radio correlation one of the uncertainties in the conversion from FIR luminosities to radio luminosities, as discussed in Section 4.2, but it is also one of the main uncertainties in the SFRD calculation. Current observations do not favor a constant q TIR (z) (Magnelli et al. 2015;Delhaize et al. 2017;Calistro Rivera et al. 2017a), although there is some discussion as to whether this evolution can be ascribed to AGN activity (Molnár et al. 2018) or selection biases such as the sampling of high mass galaxies at high redshift (Smith et al. 2021) and/or a redshift-dependent sampling of different parts of a nonlinear FIR/SFR relation (Molnár et al. 2021). To illus-  Figure 10. The impact of the fitted LF form and assumed FIR-radio correlation on the derived cosmic star formation rate density (SFRD). The left panel shows the effect of fitting pure luminosity evolution compared to the preferred model of density+luminosity evolution (assuming the same FIR-radio correlation). The right panel shows the SFRD obtained from the combined COSMOS-XS + VLA-COSMOS 3 GHZ data-sets (assuming den-sity+luminosity evolution) but for different assumed FIR-radio correlations. This gives an indication of the impact an assumed FIR-radio correlation has. The study of Madau & Dickinson (2014) is shown as a red line in both panels. In the remainder of the paper, we convert our radio LFs to SFRD using the Delvecchio et al. (2021) FIR-radio correlation. trate the impact of the assumed FIR-radio correlation on the comparison between the FIR LF and radio LF, we show in Fig. 9 the data from Gruppioni et al. (2013) converted using an evolving q TIR (z) (Eq. 11) and using the local constant value for the FIR-radio correlation: q TIR = 2.64 (Bell 2003). The difference between the two samples increases with redshift as expected due to the growing difference between the evolving and nonevolving q TIR . Fig. 9 also shows that if we assume q TIR (z) to be constant at the local value of 2.64, our radio LFs would match the FIR LFs better.
The impact of the q TIR (z) on the SFRD derived from the radio LFs is shown in Fig. 10(b).All three curves show density+luminosity evolution fitted to the combined COSMOS-XS + VLA-COSMOS 3 GHz sample, but with different values of q TIR (z). These values are derived by Delhaize et al. (2017), Algera et al. (2020b) and Delvecchio et al. (2021). Fig. 10(b) also shows the fit from Madau & Dickinson (2014) based on a collection of previously published UV and FIR data. The first FIRcorrelation we consider is from Delhaize et al. (2017). They constrained the evolution q TIR (z) using a doubly censored survival analysis on ∼ 10, 000 SFGs. To prevent from biases towards low and high average q TIR (z) measurements, these star-forming sources are jointlyselected in radio observations at 3 GHz and FIR observations. Assuming an average spectral index of −0.7, Delhaize et al. (2017) find that q TIR (z) decreases with redshift as: q TIR (z) = (2.88 ± 0.03) × (1 + z) −0.19±0.01 .
(17) Fig. 10(b) shows that this adopted q TIR (z) has a large impact on the evolution of the SFRD due to the steep evolution of q TIR (z) with redshift. The SFRD matches the fit from Madau & Dickinson (2014) at z < 1 well, after which there is an increasing and systematic discrepancy with redshift toward low implied SFRD values.
We next consider the FIR-radio relation from the recent study by Delvecchio et al. (2021). They calibrated q TIR (z) with a stacking analysis in the radio/FIR of a mass-selected sample of more than 400,000 SFGs in the COSMOS field. Delvecchio et al. (2021) find that q TIR (z) evolves primarily with M . A secondary, weaker dependence on redshift is also observed. The q TIR (M , z) is quantified as: In order to use Eq. 18 to derive the SFRD, we need to have a mass for the sample used to derive the radio LF as shown in Fig. 5. We used the mass given by the COS-MOS2015 catalog for the sources that could be matched with this catalog. We then derived the mean mass per redshift bin, ranging from 10 10.18 M to 10 10.70 M , to find q TIR (M , z). Fig. 10 shows that the SFRD derived with q TIR (M , z) described in Eq. 18 has a weaker dependence on redshift compared to Delhaize et al. (2017) and results in the best match with the compilation from Madau & Dickinson (2014) over the whole redshift range.
Lastly, we consider the FIR-radio correlation from Algera et al. (2020b), which focuses on a luminosity-limited sample SMGs. They find q TIR (z) = 2.20 ± 0.03, where they have addressed the incompleteness in the radio observations through a stacking analysis, and they find no evidence of evolution between 1.5 ≤ z ≤ 4.0. We note that the SMG sample is not well matched to the radio sample observed by the COSMOS-XS survey and the VLA-COSMOS 3 GHz Large Project. However, at z 2 the sample would be a better match as our sample traces SFGs with a high SFR. In addition, the derived q TIR (z) is free from some of the biases that come into play in q TIR (z) estimates from studies based on radio-selected samples. As expected, Fig. 10 shows that the SFRD calculated with q TIR (z) = 2.20 does not match the fit from Madau & Dickinson (2014) at z < 1, and we see that the SFRD values are in fact systematically low at all redshifts.
In summary, we show that the assumed FIR-radio relation has a significant impact on the derived SFRD. We find that the recent study by Delvecchio et al. (2021), which constitutes the first calibration of the FIR-radio correlation as a function of both stellar mass and redshift, shows the best agreement with the multi-wavelength compilation from Madau & Dickinson (2014), while the other two FIR-radio relations explored result in underpredicted SFRDs at high redshift. To be consistent with our sample selection described in Section 2.3 and given that the Delvecchio et al. (2021) result was also derived with a large unbiased sample using some of the deepest radio and FIR images available over the same field as our observations, we will use the FIR-radio correlation from Delvecchio et al. (2021) to convert our radio LFs to SFRD in the following.
6.4. Comparison with the literature In Fig. 11, we show the redshift evolution of the cosmic star formation density derived from this work compared with work in the literature derived at different wavelengths. The study of Madau & Dickinson (2014) is shown in all panels for ease of comparison. Below z < 2, our data agree well with the compilation from Madau & Dickinson (2014), although we observe some scatter in our SFRD estimates around z ∼ 0.9 which is likely due to cosmic variance (see Appendix A). Our SFRD turns over at z ∼ 1.8 and falls more rapidly than Madau & Dickinson (2014) out to high-redshift.
In Fig. 11(a) we show our derived SFRD compared to radio observations. Smolčić et al. (2009) derived the SFRD out to z = 1.3 from VLA imaging at 1.4 GHz. They assumed pure luminosity evolution for the local LF, a non-evolving FIR-radio correlation established by Bell (2003) and integrated over the full luminosity range. We find a good match with the SFRD derived by Smolčić et al. (2009) despite the different assumptions, though we note that they are only sensitive to lower redshifts (z 1) where the different assumptions have a smaller effect. Karim et al. (2011) performed stacking on mass selected galaxies and find a rise up to z ∼ 3. This rise is mainly due to the fact that they use a non-evolving FIR-radio correlation established by Bell (2003). Because this correlation does not evolve towards a lower q TIR value at high redshift, the resulting SFRD will be higher at higher z as discussed in Section 6.3. Finally, as discussed in Section 2.4, Novak et al. (2017) used the VLA-COSMOS 3 GHz Large Project to derive the SFRD up to z = 6. They assumed pure luminosity evolution and an evolving FIR-radio correlation q TIR (z) derived by Delhaize et al. (2017). Below z ∼ 2, our data agree well with the SFRD derived by Novak et al. (2017). However, our SFRD declines towards a much lower value than No-vak et al. (2017) found for z 2.2. This is due to the fitted density evolution, as discussed in Section 4.5, and can also be seen from Fig 10(b). The offset would be even larger if we would have used a similar FIR-radio correlation as Novak et al. (2017) used. This can be seen from Fig. 10(a), which shows the large offset between the SFRD we calculate assuming the FIR-radio correlation from Delhaize et al. (2017) compared to that of Delvecchio et al. (2021).
In Fig. 11(b), we compare our measurement of the SFRD to results from recent FIR observations from Gruppioni et al. (2020) and Lim et al. (2020). Gruppioni et al. (2020) derive the dust-obscured SFRD using the serendipitously detected sources in the ALPINE survey. In this case, the SFRD is derived from an extrapolation of the FIR LF, where the LF (shown in Fig. 6) is integrated down to log 10 (L IR /L ) = 8. Lim et al. (2020) derive the SFRD by integrating the FIR LF shown in Fig. 6 inferred using SCUBA-2 450 µm observations. They used the integration limits of L min = 0.03L and L max = 10 13.5 L . This integration is necessary in both studies since the data only constrains a small part of the LF, as can be seen in Fig. 6.
The first thing that stands out from Fig. 11(b) are the large error bars found in the studies of Gruppioni et al. (2020) and Lim et al. (2020), which are due to the small sample of sources considered in these studies. The observations by Lim et al. (2020) are still in agreement with our observations within these error margins. The next thing to note is that both FIR studies find a higher SFRD over the whole redshift range compared to the radio SFRD. This cannot be explained by the different integration limits, which should result in a higher radio SFRD as this is computed over the full luminosity range. However, Zavala et al. (2021) suggest that the SFRD found by Gruppioni et al. (2020) may be unusually high due to possible clustering of the serendipitous targets. Fig. 11(b) also shows results from recent sub-mm observations from Dudzevičiūtė et al. (2020) and Zavala et al. (2021). Dudzevičiūtė et al. (2020) used the AS2UDS sample from the ∼ 1deg 2 SCUBA-2 survey to derive the SFR from magphys fits. The SFRD is then found from an extrapolation of the 870µm flux limit of 3.6 mJy to 1 mJy (equivalent to L IR ≈ 10 12 L ) using the slope from the sub-millimeter counts in Hatsukade et al. (2018). Because of the area covered by this survey, it is likely to be much more representative than smaller volume studies such as Gruppioni et al. (2020). The curve from Dudzevičiūtė et al. (2020) does not match our radio SFRD at z 3 because this curve does not represent the total SFRD but shows the SMG contribution. The study by Dudzevičiūtė et al. (2020) demonstrates that the activity of SMGs peaks at z ∼ 3, suggesting that more massive and obscured galaxies are more active at earlier times. At z ∼ 4 the curve is roughly consistent with our data. Zavala et al. (2021) used the results from the MORA survey to search for DSFGs at 2 mm. The number counts from the survey are combined with the number counts at 1.2 and 3 mm to place constraints on the evolution of the FIR LF by making use of the evolution model of Casey et al. (2018). The SFRD is then found by integrating the best-fit FIR LF with an integration interval of log 10 (L IR /L ) = [9, 13.8]. The curve from Zavala et al. (2021) is consistent with our data de- This work: Φ and L evolution Bouwens+2020 Bouwens+2020: dust-corrected Figure 11. Cosmic star formation rate density (SFRD) history. Our SFRD history is shown with filled circles in all panels and is obtained from the combined COSMOS-XS + VLA-COSMOS 3 GHZ data-sets (assuming density+luminosity evolution). The study of Madau & Dickinson (2014) is shown as a red line in all panels. All data shown for comparison are indicated in the legend of each panel; see text for details. The comparison of radio-and UV-based SFRDs, integrated down to the same limit, in panel (c), shows that the UV-based SFRD from Bouwens et al. (2020) falls ∼ 1 dex below the radio SFRD at z 2.8. This suggests that the bulk of the star formation contributed by high-luminosity sources at high redshifts is not accounted for by dust corrections. spite the different integration limits.
In addition, Fig. 11(b) shows the results from the UV observations from Bouwens et al. (2020). They make use of ALMA observations for a sample of galaxies in the HUDF at 1.5 < z < 10 to provide improved constraints on the IRX-β relation. Bouwens et al. (2020) integrate their UV LFs from 0.03 × L to → ∞ in order to derive the SFRD. The radio SFRD matches the UV SFRD at z ∼ 3 and at z > 3 the UV SFRD rises above the radio SFRD. However. it is import to realize that the UV SFRD and radio SFRD compared in Fig. 11(b) are derived using different integration limits.
Differing integration limits will have a more substantial effect for the comparison between our radio-based study and UV-based studies, given the different shapes of the derived LFs evident in Fig. 7. To investigate the impact of the integration limits, in Fig. 11(c), we compare our radio-based results with the FIR-based study from Zavala et al. (2021) and the UV-based study from Bouwens et al. (2020), but now using a consistent integration limit across all studies except for the compilation from Madau & Dickinson (2014) which remains unchanged for ease of comparison. In particular, Bouwens et al. (2020) originally integrate their UV LFs from 0.03 × L to → ∞ in order to derive the SFRD. However, a fairer comparison of the radio-and UV-based SFRDs necessitates that they be integrated down to the same limit. As the radio observations do not reach the faint luminosities the UV observations reach, we have chosen the integration limit as the luminosity limit reached by the radio observations between z = 1.15 and z = 1.43, which is log 10 L 1.4 GHz = 22.7 WHz −1 . This corresponds to a luminosity limit of −21.5 mag (0.038 L z=3 ) for the UV LF. For the Zavala et al. (2021) FIR-based study, this corresponds to an FIR luminosity limit of of 10 11.15 L .
Below z < 2, Fig. 11(c) shows that the radio data now falls below the SFRD from Madau & Dickinson (2014), which can be explained by the limit that has been set for the integration of the radio LF. For z 2.2, the difference between our radio-based SFRD and Madau & Dickinson (2014) becomes similar to what was found in Fig. 11(a) and Fig. 11(b). The Zavala et al. (2021) curve appears similarly affected by the new integration limits, now falling below the Madau & Dickinson (2014) compilation, but continuing to follow the radio-based SFRD reasonably well.
In contrast, the UV-based SFRD from Bouwens et al. (2020) falls ∼ 1 dex below the radio SFRD at z 2.8. This result is very different from a naive comparison between the radio and UV-based SFRDs using their respective nominal integration limits, which would result in a reasonable match of the SFRDs even at the high redshift end. However, this can be explained by a "conspiracy" between the amount in which different sources contribute to the LFs at the different wavelengths. Observations in the UV find that the faint-end slope of the UV LF at high redshift is very steep, and the bulk of the luminosity at high redshift is thus coming from faint sources, as can be seen in Fig. 7. Our radio observations, on the other hand, suggest a much shallower faint-end slope, but they instead find a significant amount of star formation in high-luminosity sources that is missed by UV observations. When the integration limit is thus fixed to avoid extrapolating the radio LFs significantly below our detec-tion limit, we find a significant discrepancy in the resulting SFRDs. Fig. 11(c) shows that this is true even when UV observations are corrected for dust. In particular, Bouwens et al. (2020) make use of improved constraints on the IRX-β relation. This ∼ 1 dex discrepancy in the resulting SFRDs therefore suggests that the bulk of the star formation contributed by high-luminosity sources at high redshifts is not accounted for by dust corrections. As discussed in Section 4.3, including "optically dark" sources would only increase this discrepancy further.

SUMMARY & CONCLUSIONS
We studied a 3 GHz-selected sample of star-forming galaxies (SFGs) identified in the ultra-deep, multi-band COSMOS-XS survey. Using the deep multi-wavelength data available in the COSMOS field, and selecting SFGs based on the FIR-radio correlation, we identify ∼1300 SFGs with redshifts out to z ∼ 4.6. We use this SFG sample to study the evolution of the radio luminosity function (LF) with redshift.
We fit our radio LFs with a modified-Schechter function evolved in luminosity (pure luminosity evolution). By fixing the faint and bright end shape of the radio LFs to the local values, we find a strong trend in redshift for the luminosity parameter of α L ∝ (3.40 ± 0.11) − (0.48 ± 0.06)z. This evolution agrees with what has been reported in previous radio-based studies (e.g., Novak et al. 2017).
We then combined the ultra-deep COSMOS-XS dataset with the shallower VLA-COSMOS 3 GHz large project data-set over the wider COSMOS field. This combination increases our dynamic range to include both the faintest and brightest sources, allowing us to simultaneously constrain the density and luminosity evolution. Doing so, we find evidence for significant density evolution over the observed redshift range.
In order to compare our radio LFs to FIR LFs, we converted FIR luminosities to radio luminosities using a redshift-dependent FIR-radio correlation. We find that our LFs agree well with the FIR LFs at z < 2. At z > 2 our LFs are systematically lower than Gruppioni et al. (2013), which we attribute at least partly to AGN contamination. In addition, we find that the radio data is most consistent with the dust-poor model from Casey et al. (2018).
We also compare the radio LFs to the UV LFs of Mehta et al. (2017), Ono et al. (2018) and Bouwens et al. (2021), which are based on UV rest-frame observations of Lyman break galaxies. By fitting the local LF to the UV and UV+radio LFs and integrating down to 0.03 L z=3 , we find evidence for a significant underestimation of the UV LF by 21.6% ± 14.3 % at high redshift (3.3 < z < 4.6). We attribute this underestimation to appreciable star formation in highly dust-obscured galaxies.
We integrate the derived radio LFs with joint den-sity+luminosity evolution to determine the cosmic star formation rate density (SFRD). We find the radioderived SFRD to be consistent with the established behavior at low redshift, where it increases strongly with redshift out to z ∼ 1.8. The radio-based SFRD then declines more rapidly out to high-redshift than previous radio-based estimates, and is more consistent with the recent FIR-based estimated from Zavala et al. (2021).
In order to more directly compare the radio-based SFRD derived here with the recent UV-based SFRD from Bouwens et al. (2020), and to avoid extrapolating far below the radio detection limit, we integrate both LFs down to a consistent limit (0.038 L z=3 ). This direct comparison reveals that the discrepancy between the radio and UV LFs discussed above translates to an even more significant (∼1 dex) discrepancy between the radioand UV-based SFRDs at high redshifts (z > 3). This discrepancy persists even when the UV observations are corrected for dust obscuration assuming the latest dust corrections. The discrepancy would only increase with the inclusion of "optically dark" sources, which will be discussed further in a future paper. 2.5 < z < 3.0  Table 4 gives the luminosity functions of star-forming galaxies in the COSMOS-XS survey obtained with the V max method.   Fig. 13 shows the two dimensional posterior probability distributions of α L and α D for the density+luminosity evolution fitted to the combination of the COSMOS-XS survey and the VLA-COSMOS 3 GHz large project. The marginalized distributions for each parameter is shown independently in the histograms.