COSMOS2020: The galaxy stellar mass function The assembly and star formation cessation of galaxies at 0.2< z 7.5

Context. How galaxies form, assemble, and cease their star formation is a central question within the modern landscape of galaxy evolution studies. These processes are indelibly imprinted on the galaxy stellar mass function (SMF)


Introduction
The galaxy stellar mass function (SMF) is defined as the number density of galaxies Φ(M, z) in bins of stellar mass ∆M at each redshift z, and is a fundamental cosmological observable in the study of the statistical properties of galaxies.Understanding its shape and evolution ultimately informs us about the growth of the baryonic content of the Universe, and can help infer the star formation activity and the availability of cold molecular gas across cosmic time.Its integral over M is the galaxy stellar mass density (SMD) ρ * (z), which describes the cumulative mass assembled by a given epoch.
A coherent picture has emerged from these studies.For instance, the shape of the SMF for star-forming galaxies at z < 3 is well described by the empirically constructed Schechter function (Schechter 1976), which features an exponential downturn at a characteristic stellar mass M * , with a low-mass end that declines with a slope α; both seem to remain constant out to z ≈ 2 (Marchesini et al. 2009;Ilbert et al. 2013;Tomczak et al. 2014;Davidzon et al. 2017;Adams et al. 2021).This constancy in the shape of the SMF implies that star formation activity has acted consistently to transform baryons from cold gas into stars, and appears to have formed > 75% of the total stellar mass of the Universe in ≈ 10 Gyr.Only the normalization of the SMF, Φ * (z), of star-forming galaxies has been confidently shown to evolve with cosmic time, and its rapid evolution at earlier times points directly to enhanced rates of mass growth at earlier epochs (Popesso et al. 2023, and references therein).
Which physical processes are responsible for the behavior of the SMF, and the extent of their influence at different cosmic epochs, are not well understood even in the low redshift Universe (z < 1).Examples include feedback from supermassive black holes, supernovae, galaxy-galaxy mergers, stellar and gas dynamics, and gaseous inflows and outflows, all of which act on varying scales both physically and in time.As a result, significant effort has been undertaken to build detailed cosmological simulations to identify and understand the role of physical processes that underpin the observed shape and evolution of the SMF (e.g., Somerville et al. 2015;Furlong et al. 2015;Lagos et al. 2018;Laigle et al. 2019;Pillepich et al. 2018;Davé et al. 2019;Lovell et al. 2021).Hence, precise measurements of the SMF and SMD are key observables utilized by most (but not all, see e.g., Dubois et al. 2014) large-scale cosmological simulations to tune input parameters such that the SMF of the local Universe (z ≈ 0) is recovered.Comparisons of predicted and observed SMFs at earlier cosmic ages can point to the physical processes at play and/or suggest recipes in need of refinement (Torrey et al. 2014;Furlong et al. 2015).
Owing to the challenges associated with large-scale simulations, purely analytical, data-driven models have enjoyed great popularity with the introduction of the first large galaxy samples with photometric redshifts (photo-z).For example, Peng et al. (2010) constructed a phenomenological model to explain the bimodality of galaxy types (star-forming and quiescent) as seen from their mass functions, megaparsec-scale environment, and star formation activity.That is, as a consequence of two mechanisms driving star formation cessation (often referred to as "quenching"): massive galaxies cease forming stars irrespective of environment ("mass quenching"), and galaxies in dense environments cease forming stars irrespective of mass ("environmental quenching").Several discrete mechanisms have been proposed to explain the latter, such as ram pressure stripping (e.g., Gunn & Gott 1972), gas strangulation (e.g., Larson et al. 1980;Balogh et al. 2000), and dynamical heating of gas within haloes (e.g., Gabor & Davé 2015).Proposed mechanisms for the former must, by definition, involve secular processes.This includes star formation cessation due to structural changes, or heating and/or ejection of gas by central super-massive black holes for the most massive galaxies -radiative feedback from active galactic nuclei (AGN) -or by supernovae for less massive systems as they are more weakly self-gravitating (e.g., Shankar et al. 2006).Peng et al. (2010) hypothesized that while environmental effects reproduce the single Schechter shape observed for star-forming, blue galaxies in the local Universe, it is through a combination of both environmental and mass effects that the two-component Schechter shape observed for quiescent, red galaxies is reproduced.A wide variety of extensions have been applied to this model to directly incorporate other measurements such as the gas fraction (Bouché et al. 2010), and wholly new models continue to be developed (e.g., Peng et al. 2015;Belli et al. 2019;Suess et al. 2021;Varma et al. 2022).
The observed distributions of stellar mass and redshift are not in themselves the intrinsic, underlying SMF.Instead, they are bridged by statistical inferences between carefully selected samples and a vast, practically unobservable parent population from which a sample is taken.Insufficient control of biases and systematics obscure such inferences by creating unrepresentative samples, and hence weaken comparisons with simulation and analytical models (Marchesini et al. 2009;Fontanot et al. 2009).Great effort has been expended over the past 20 years to strip away a number of these biases, and their obscuring effects (Cole et al. 2001;Pozzetti et al. 2010;Marchesini et al. 2009;Bernardi et al. 2017;Davidzon et al. 2017;Leja et al. 2019a;Adams et al. 2021).Examples of these biases include sample incompleteness (including Malmquist biases and mass completeness), the so-called "cosmic" variance relating to sampling over-and under-dense regions of large scale structure, and Eddington bias which acts to overestimate the number density of the most massive galaxies (Malmquist 1920(Malmquist , 1922;;Eddington 1913).While many studies incorporate only poisson noise (Fontanot et al. 2009, although this is steadily improving with time), other notable uncertainties exist including sample variance and effective volume size, as well as uncertainties on photoz and stellar mass; the latter two items can introduce significant bias.Since the ultimate goal of any survey is to generalize the observed properties of a specific sample to that of the entire population, ignoring any of these important biases or uncertainties severely complicate this effort.
Studies of the SMF at increasingly higher redshift have been made possible due to advances in facilities and continued investment in deep, primarily photometric surveys of the distant Universe.Building on work by Songaila et al. (1994) and Glazebrook et al. (1995), Cowie et al. (1996) secured the first masscomplete samples at z ≈ 2−3, selected in the near-infrared (K) to directly constrain the Balmer continuum at ∼ 0.6 µm.Sampled nearer the stellar bulk at ∼ 1 − 2 µm, their mass-selected samples enabled a higher degree of mass completeness, and in turn provided a more representative view of the high-z Universe compared to optically selected forerunner surveys.More recently, precise mass estimates from similarly selected samples have been obtained by fitting observed spectral energy distributions (Tomczak et al. 2014;Martis et al. 2016;Straatman et al. 2015), from the ground (with VISTA, UKIRT) and space (HST/WFC3, Spitzer/IRAC).Although limited in area, samples recovered by HST have continued to pose new and significant challenges to existing paradigms by highlighting a series of stark changes in the shape of the SMF that indicates earlier galaxy populations were fundamentally different from those in the present day Universe.Additional studies at these early times (z ≳ 2) by degreescale surveys capable of finding the rarest sources have revealed the existence of surprisingly mature, massive quiescent galaxies whose mass has already been assembled by z ≈ 4 (e.g., Ilbert et al. 2013), challenging the assumed timeline typical for galaxy assembly (Steinhardt et al. 2016;Behroozi & Silk 2018).Limited numbers of them have been confirmed by detailed spectroscopic follow-up (e.g., Glazebrook et al. 2017;Schreiber et al. 2018a;Tanaka et al. 2019;Valentino et al. 2020;Forrest et al. 2020a,b).Likewise, several studies have placed constraints on the massive end of the SMF at the highest-redshifts (z > 6) from samples selected by their rest-frame UV luminosity (e.g., Song et al. 2016;Harikane et al. 2016), who utilized empirically calibrated L UV − M relations to convert directly measured UV luminosity into indirect estimates of stellar mass.
Although tantalizing, deriving mass estimates from UV luminosity involves significant uncertainties; only deep infrared photometry can directly measure the rest-frame stellar bulk (λ > 4000 Å, although ideally ∼ 1 − 2 µm) required to more directly assess stellar mass at these redshifts.Only recently have deep (K s ≈ 25) near-infrared photometric studies enabled the first mass-selected samples at z > 2, although with mass uncertain-ties increasing with redshift (Retzlaff et al. 2010;Fontana et al. 2014;Laigle et al. 2016).Importantly, photometric surveys of galaxies have -and are expected to remain -the primary means by which these measurements will be made; obtaining even elementary parameters for large (N > 100 000) samples with deep spectroscopy, while precise, is simply too expensive even when utilizing highly multiplexed instrumentation.For the time being, spectroscopy of the distant Universe will remain a follow-up exercise to strengthen larger photometrically selected samples.
Of these deep photometric surveys, those with large areas have a key advantage: they probe a wider range of environments (i.e., density) compared to narrower surveys.As such, they provide more representative samples that are significantly less likely to be affected by field-to-field cosmic variances (and, for the same reason, are more suited to find rare, massive galaxies).Although one may resort to combining disparate data sets into a single analysis to combat this (e.g., Moster et al. 2013;Henriques et al. 2015;Volonteri & Reines 2016;McLeod et al. 2021), the unknown systematics between survey selection functions and depths make the interpretation of these joined samples more uncertain.
Stitching together surveys of low-and high-z samples carries similar concerns.A lack of uniform selection owing to different survey areas, detection bands, and photo-z determination strategies (i.e., dropout selection) can likewise complicate interpretations arising from such composite samples (e.g., Leja et al. 2019a;McLeod et al. 2021;Adams et al. 2021;Santini et al. 2022).Worse, these photometric samples may have been processed with different spectral energy distribution (SED) fitting codes that assume different templates, emission line recipes, and dust attenuation laws.Differing choices of cosmology and initial mass function, while reversible, nonetheless add complexity.
Accurate estimates of the galaxy stellar mass function at increasingly higher redshifts requires complementary deep observations from near-infrared selected samples to ensure both mass completeness and data reliability.Currently the widest field with near-infrared coverage of the requisite depth to probe large samples of z ≫ 3 galaxies is the Cosmic Evolution Survey (COS-MOS; Scoville et al. 2007).This work takes advantage of the latest NIR-selected catalog of the COSMOS field, COSMOS2020 (Weaver et al. 2022).We adopt the "The Farmer" catalog whose photometry was consistently measured across all bands by fitting galaxy profiles while taking into account PSF variations, depth, and source crowding.From the resulting photo-z and stellar masses we construct a consistently measured SMF from z = 0.2 − 7.5, identify quiescent and star-forming systems, and study the build-up and assembly of stellar mass over 10 Gyr of cosmic history.This includes a detailed study of key moments in the development of galaxy populations from the Era of Reionization (z > 6) to the peak of star formation activity at Cosmic Noon (z ∼ 2), up to the rich bimodality of star-forming and quiescent galaxies observed at more recent times (z ≲ 1).
This paper is organized as follows.Section 2 introduces the dataset chosen for this analysis from which samples are drawn and possible uncertainties discussed in Section 3. Section 4 provides a brief overview of the Schechter function.Results are presented in Section 5, including the presentation and fitting of the total, star-forming, and quiescent mass functions.These are compared to literature measurements in Section 6, whereupon further discussion is had with regards to galaxy assembly, star formation cessation, and connections to dark matter halos.This work concludes in Section 7.

Data: COSMOS2020
We utilize the most recent release of the COSMOS catalog, COSMOS2020 (Weaver et al. 2022), comprised of ∼ 1 000 000 galaxies selected from a izY JHK s coadd with near-infrared depths approaching 26 AB ensuring a mass-selected sample complete down to 10 9 M ⊙ at z ≈ 3.This sample is complemented by extensive supporting photometry ranging from the UV to 8 µm over a total area of 2 deg 2 , making it the widest nearinfrared-selected multiwavelength catalog to these depths.This is made possible by the latest release of the UltraVISTA survey (DR4; McCracken et al. 2012;Moneti et al. 2019), which is the longest running near-infrared survey to date, and complemented by even deeper optical grizy data provided by Subaru's Hyper Suprime-Cam instrument (HSC PDR2; Aihara et al. 2019).Flanking this core imaging are u band measurements made by the CLAUDS survey from the Canada-France-Hawaii Telescope (Sawicki et al. 2019) and deep Spitzer/IRAC imaging in all four channels, reprocessed to include nearly every exposure taken in COSMOS for use in this catalog (Euclid Collaboration et al. 2022, and references therein).As in the previous COSMOS catalog by Laigle et al. (2016), several intermediate and narrow bands from both Subaru/Suprime-Cam and VISTA (Milvang-Jensen et al. 2013) are used to provide precise determinations of photometric redshifts.
In addition to the comprehensive set of data, photometry is performed by two methods to produce two separate catalogs.The catalog called Classic contains aperture photometry extracted with Source Extractor (Bertin & Arnouts 1996) for the optical/near-infrared and with IRACLEAN (Hsieh et al. 2012) for the infrared (as in Laigle et al.).The catalog called The Farmer contains profile-fitting photometry extracted with The Farmer (Weaver et al. 2023), which uses The Tractor (Lang et al. 2016) to construct and apply parametric models to estimate source fluxes and does so consistently across more than 30 bands, including all four IRAC channels.Each photometric catalog is paired with photometric redshift and physical parameter estimates from both LePhare (Arnouts et al. 2002;Ilbert et al. 2006) and EAzY (Brammer et al. 2008) resulting in four measurements of photo-z and M per source.The combination of comprehensive deep images, independent photometry techniques, and independent SED fitting enables an unparalleled control of systematics and uncertainties, and hence further insight when inferring the underlying true nature of galaxy populations.See Weaver et al. (2022) for details.

Selection function
Galaxies in COSMOS2020 are selected from a near-infrared izY JHK s CHI-MEAN coadded detection image (Szalay et al. 1999;Bertin 2010).The deepest band is i, with a 3σ sensitivity limit at ∼27 mag, and the shallowest is K s at ∼25 mag1 .The reddest selection band, K s delivers mass-selected samples to z ≲ 4.5.Furthermore, the comparably deeper K s imaging (by 0.5 − 0.8 mag relative to COSMOS2015) translates directly into a higher degree of completeness where low-M samples are more mass complete down to lower masses.Taken together with IRAC, these images provide reliable detections of the reddest, oldest, and heavily dust obscured sources not seen in COS-MOS2015.
Although the nominal area of the COSMOS survey is 2 deg2 , the most secure region comprises 1.279 deg 2 after removing contamination due to bright star halos and requiring an intersection of the deep near-infrared UltraVISTA imaging and the Subaru Suprime-Cam intermediate bands, where the photo-z performance is generally best.Compared to Davidzon et al. (2017) who use UltraVISTA DR2 (McCracken et al. 2012) imaging to probe the near-IR, here we rely on UltraVISTA DR4 (Moneti et al. 2019), reaching greater and simultaneously uniform NIR depth across the entire field (especially K s ).These non uniform depths restricted the SMFs of Davidzon et al. to only the four deepest stripes in UltraVISTA spanning 0.62 deg 2 .The gains leveraged in this work are hence five-fold.Firstly, the total number of recovered sources at all apparent magnitudes increases due to the additional area, improving statistical margins especially of rare sources.Secondly, the increased optical and nearinfrared depths permit the recovery of fainter sources, improving flux completeness.Thirdly, the increased depth reduces photometric uncertainties leading to more precise, less biased photo-z and M. Fourth, the consistent depth and improved imaging performance from Hyper Suprime-Cam provides significantly better photometry relative to Suprime-Cam.Lastly, the wider field makes it less likely to become biased due to specific structures along the line-of-sight (e.g., clusters, voids) and instead probes a greater variety of environments affecting the evolution of galaxies within them.
As presented in Fig. 13 of Weaver et al. (2022), the best photo-z performance at i > 22.5 AB is achieved by the combination of The Farmer photometry and photo-z from LePhare with a δz/(1 + z) precision of < 1% at i ≈ 20 and < 4% at i ≈ 26 AB.For this reason, and to expedite comparisons with the most similar work in the literature (Ilbert et al. 2013;Davidzon et al. 2017), this work adopts the photometry from The Farmer paired with the photo-z, M, and rest-frame magnitudes estimated by LePhare.This combination will be henceforth referred to as COSMOS2020, unless explicitly stated otherwise.Our photo-z estimates correspond to the column LP_zPDF, defined as the median of the redshift likelihood distribution L(z) (often referred to as p(z)).We choose to adopt M estimates from MASS_MED as the median of the mass likelihood distributions are generally less susceptible to template fitting systematics than MASS_BEST which is taken at the minimum χ 2 SED .We find that the two agree at all redshifts within 0.01 dex with a narrow scatter whose 68% range is below 0.05 dex.
Identification of stellar contaminants proceeds as described in Section 5.1 of Weaver et al. (2022).Briefly, LePhare computes the best-fit stellar template to each SED from a range of templates, including white dwarf and brown dwarf stars.Sources that achieve a χ 2 from a stellar template fit lower than any galaxy template are removed.An additional criterion on morphology is used, identifying likely stars as point-like sources from the COSMOS HST/ACS mosaics (Koekemoer et al. 2007) and Subaru/HSC (PDR2, Aihara et al. 2019) for the i < 23 and i < 21.5 AB, respectively.Bright, resolved sources with SED shapes similar to stars are not removed.
An initial sample of 636 567 galaxies 2 in the range 0.2 < z ≤ 6.5 is selected from the contiguous 1.27 deg 2 COMBINED region defined as the union of the UltraVISTA and Subaru/SC footprints but removing regions around bright stars3 .However, the source density at 6.5 < z ≤ 7.5 becomes noticeably greater in the ultra-deep stripes and so we restrict sources in this particular z-bin to only the 0.716 deg 2 of the ultra-deep stripes also covered by Subaru/SC but away from bright stars, finding 1 327 galaxies.The construction of a robust stellar mass function requires precise redshifts and accurate stellar masses.Since an infrared detection is required to accurately measure galaxy stellar masses at z ≳ 2−3, where this study seeks to cover new ground, 227 305 sources with m ch1 > 26 AB (i.e., S /N ≲ 5) are removed; ≈ 93% of which fall below the mass completeness limit introduced in Section 3.3.An additional 13 644 sources with ambiguous redshifts are removed, quantified for simplicity by having > 32% of their redshift probability lying outside z phot ± 0.5; ≈ 70% of which fall below the mass completeness limit.Finally, we remove a further 1 850 sources with unreliable SED fits with reduced χ 2 > 10.These three necessary cuts create a final sample of 395 095 galaxies with precise redshifts and accurate stellar masses.We note that many excluded sources fail more than one of these three criteria (e.g., 90% are undetected in the infrared); Fig. A.1 shows their cross-section.Although the rejected objects are potentially interesting, their deeply uncertain nature does not permit us to reliably incorporate them into our measurements without a substantially more complex approach (e.g., to model potential outliers, see Hogg et al. 2010).Overall the excluded sources are predominantly faint (median K s ∼25.7 AB) such that their best-fit masses are generally below our adopted mass completeness limit and their potential for bias minimal within the mass ranges examined in this work.

Quiescent galaxy selection
Accurate identification of distant and/or unresolved quiescent galaxies has been a longstanding problem (see e.g., Leja et al. 2019b;Shahidi et al. 2020;Steinhardt et al. 2020).While quiescent galaxies can be identified by their low star formation rates, SFR estimates from fitting broad-band photometry are subject to large uncertainties and are dependent on the particular SED template library (Pacifici et al. 2015;Leja et al. 2019a;Carnall et al. 2019).The other most successful approach is that of restframe colors, as the lack of blue O-and B-type stars in quiescent populations implies a dearth of UV emission and a red optical contintuum.Similar photometric UV to NIR SEDs are measured for heavily dust-enshrouded but otherwise star-forming systems, making the challenge of color-color selection that of distinguishing truly quiescent galaxies from dusty star-forming ones, in the absence of FIR measurements.Popular rest-frame color-color selections include (U − V) vs. (V − J), (NUV − r) vs. (r − J), (NUV −r) vs. (r − K), and (FUV −r, r − J) (Williams et al. 2009;Ilbert et al. 2010;Arnouts et al. 2013;Leja et al. 2019b, respectively), in addition to observed-frame methods such as (B − z) vs. (z − K) for sources at 1.4 ≲ z ≲ 2.5 (Daddi et al. 2004).However, each implies a different definition of quiescence (see Leja et al. 2019b;Shahidi et al. 2020).These pairs of rest-frame colors may be measured directly from the observed photometry (where available) or derived from the best-fit templates that should largely agree with observed-band photometry, and can also be meaningfully extrapolated in cases where appropriate observed-frame data are not available.While the former may be adversely affected by uncertainties and systematics propagated from the observed photometry, the latter assumes that the SED template library or basis contains a suitable model.It is worth noting that efforts are being made to employ machine learning techniques to lessen the dependence and bias impact from model assumptions (e.g., Steinhardt et al. 2020), although no such method has gained comparable use as yet.
Recent work by Leja et al. (2019b) has clarified the efficacy of these color-color selections, proposing to extend the typically adopted U − V vs. V − J baseline further into the UV on the basis of stronger correlation with intrinsic sSFR as measured in simulated photometry (see also Arnouts et al. 2013).As such, catalogs containing reliable and deep FUV and NUV photometry have the advantage of being able to identify low-z quiescent populations which though extrapolation can be consistently identified up to the highest redshifts even when the observedframe U band no longer corresponds to rest-frame U.At redder wavelengths the rest-frame J band is crucial for measuring the rest-frame stellar bulk, however, it becomes easily redshifted out of most deep surveys with IRAC photometry by z ≈ 2 − 3 which renders quiescent galaxy selections highly dependent on template libraries even at these intermediate redshifts.
In order to provide broad comparisons to other mass functions, in particular those measured in COSMOS (e.g., Ilbert et al. 2013;Davidzon et al. 2017), this work selects quiescent galaxies following the criteria introduced by Ilbert et al. (2013): (NUV − r) > 3 (r − J) + 1 and (NUV − r) > 3.1 (1) whereby the slant line runs perpendicular to the direction of increasing sSFR and parallel to an increase in dust attenuation to separate truly quiescent systems from otherwise dusty starforming contaminants.For clarity, we henceforth refer to this (NUV − r) vs. (r − J) selection as NUVrJ.Generally, this selection has been found to approximate a cut in sSFR ≲ 10 −11 yr −1 (Davidzon et al. 2018).
It is important to note that LePhare computes rest-frame magnitudes by attempting to find an observed frame band which directly probes the rest-frame band, which for the reasons listed above is less model-dependent than simply adopting the restframe colors of the best-fit galaxy template (or sSFR) and hence conveys a view of the true variety of observed galaxies that is less biased by our assumptions.Although the rest-frame photometry are estimated using a K-correction and color-term, they are most strongly dependent on the best-fit template at redshifts where observed photometry are not well matched to the rest-frame band and are then effectively extrapolated from the best-fit template (see Appendix 1 of Ilbert et al. 2005 for details).
Unbiased sample selection becomes increasingly difficult with redshift.By z ≈ 3, rest-frame J-band fluxes are no longer directly measured even by IRAC channel 24 and so become increasingly model-dependent and uncertain.While rest-fame NUV remains constrained even at z > 6, rest-frame r must be extrapolated by z ≈ 5 at which point differentiation between quiescent and dusty star-forming galaxies becomes statistically impossible.Because of this, and the expectation that quiescence is unlikely at such early times, selection of quiescent galaxies in this work is limited to z ≤ 5.5, noting that selection between 3 < z ≤ 5.5 is subject to a significantly higher degree of uncertainty.Advancement in this selection will only be made possible with deeper infrared imaging from facilities such as JWST (Rigby et al. 2023).
Selection of quiescent galaxies is presented in Fig. 1, showing star-forming and quiescent galaxies whose masses are above their respective mass limits (see Section 3.3).Error bars on the rest-frame colors are estimated from the quadrature addition of the observed-frame filter nearest to that of the rest-frame.Given that even rest-frame photometric uncertainties are generally inversely proportional to mass at given redshift, these error bars are most representative for median mass systems but are overestimated for bright, massive galaxies.In addition, uncertainties become increasingly underestimated at z > 3 as the extrapolation based on the best-fit template is not propagated.However, it is already obvious that the dividing power of the slant line is significantly diminished at z > 3 where the rest-frame J band is no longer directly observed.Both the quiescent and dusty starforming regions appear to be devoid of sources by z ≈ 6, such that no intrinsically quiescent (misclassified or not) are found at these early times.Whether this is an intrinsic feature of galaxy populations or merely a selection effect is discussed below in Section 3.3.For these reasons this work restricts distinction between quiescent and star-forming galaxies to z ≤ 5.5.Galaxies at z > 5.5 in this work should be considered star-forming.

Completeness
Accurate estimates of completeness are crucial when inferring general properties about a population from an otherwise incomplete sample.While advancements in near-infrared facilities have enabled breakthroughs in selecting representative sample of galaxies by measuring their stellar bulk, samples are still mass limited and these mass limits evolve with z, owing to the faintness of increasingly distant galaxies.As such, mass limits are highly dependent on accurate estimates of survey depths and their impact on the selection function as it relates to the detection of the lowest mass galaxies.
There are three known populations which are expected to be missed by a near-infrared selection whose deepest images are on the blue-end of the selection function.Firstly, the faintest starforming blue galaxies in the local Universe (e.g., dwarf irregular starbursts) may have detectable fluxes blueward of i, but they will not be included in existing izY JHK s selections as their near-infrared emission is too faint.However, their contribution is expected to be limited to only the low-mass end of the galaxy stellar mass function, and can henceforth be well characterized in large numbers by deeper small field surveys (e.g., CANDELS: Grogin et al. 2011;Koekemoer et al. 2011).
Secondly, quiescent galaxies are more difficult to detect than star-forming galaxies owing to a lack of rest-frame UV and blue optical emission, meaning their detection relies upon deep nearinfrared imaging at both low and high-redshifts.For this reason, an izY JHK s detection will be insufficient to detect the quiescent systems compared to star-forming ones of the same mass, meaning that the quiescent sample will be as complete as the starforming sample at higher masses.We compute their respective   Ilbert et al. (2013).The rest-frame J band is not directly measured at z ≳ 3.5 and so the sloped part of the selection box is dashed to indicate that the r − J colors are strongly model dependent.Typical uncertainties for the rest-frame colors of the star-forming and quiescent subsamples are estimated as medians of the nearest observed-frame bands, consistent with the derivation of the rest-frame colors.See the text for details.mass completeness limits separately and apply them consistently throughout.
Lastly, and similar to quiescent galaxies, the most heavily dust obscured star-forming galaxies (A V ≫ 5) at high redshift (z ≳ 2) will not present appreciable optical or near-infrared fluxes to be detected in COSMOS2020, but unlike quiescent galaxies are ubiquitously and efficiently detected in far-infrared, submillimeter, and radio surveys (e.g., Schreiber et al. 2018c;Wang et al. 2019;Sun et al. 2021;Jin et al. 2019;Fudamoto et al. 2020Fudamoto et al. , 2021;;Casey et al. 2021;Shu et al. 2022).The nature and extent of this recently discovered population remains difficult to quantify, owing to a combination of complex selection functions and serendipitous detections.The majority are likely to be missed by an izY JHK s selection function even at the depths of COSMOS2020 (see discussions in Section 6.3).
Although mass completeness estimates are presented in Weaver et al. (2022), they are derived for a comparably less secure sample than used in this work.For this reason the mass completeness limits are rederived in identical fashion following the method of Pozzetti et al. (2010).A critical advantage of this method is that it does not rely on the theoretical mass distribution of galaxies fainter than the magnitude limit, but assumes that those just above the threshold share similar properties with the undetected ones, modulo a rescaling factor as detailed below.This contrasts with studies that estimate mass completeness either through injection-recovery of simulated sources, or extrapolation of the observed distribution itself below the magnitude limit.The latter approach, in particular, may underestimate the stellar mass limit if the galaxy sample is sparse due purely to astrophysical reasons and not genuine incompleteness.In our case, the sample in each z-bin is first cleaned by discounting the 1% worst-fit sources via χ 2 .The stellar masses M of the 30% faintest galaxies in channel 1 are then rescaled such that their observed channel 1 apparent magnitude m ch1 matches the IRAC sensitivity limit m lim = 26: Log 10 (M resc ) = Log 10 (M) + 0.4 (m ch1 − 26.0) . ( The limiting mass M lim is then taken to be the 95 th percentile of the M resc distribution.Finally 2 nd order expansions in (1 + z) are fitted to each M lim per z-bin up to z = 7 for the total sample, and z = 5 for star-forming and quiescent samples, to produce a smoothly evolving limiting mass.Limits above which samples are ∼ 70% mass complete (see justification below) are derived consistently for the total sample from 0.2 < z ≤ 7.5 and for the star-forming, and quiescent samples from 0.  with magnitude limits adopted from IRAC channel 1 (filled circles), then interpolated with Eqs. 3, 4, and 5, respectively, resulting in the solid lines.For quiescent galaxies, consistent estimates using K s are also shown (empty circles).To verify these estimates we compute conservative mass limits from simulated spectra following a delta-like burst at z = 15 which are normalized separately to the K s and channel 1 magnitude limits; they are shown by the dotted and solid red lines, respectively.See the text for details.
and are shown in Fig. 2. Despite the more conservative selection adopted in this work, the derived mass completeness limits are essentially identical to those derived in Weaver et al. (2022), which indicates the robustness of these limits against sample selections.For reference, at z ≈ 1 the quiescent sample is complete at ∼ 0.25 dex in mass above the star-forming sample, growing to ∼ 0.45 dex above by z ≈ 3 − 5.
There remains an additional incompleteness arising from the fact that mass completeness is derived from IRAC channel 1 photometry, and yet our izY JHK s selection function does not include sources which are only identified in IRAC.As discussed in Davidzon et al. (2017), despite this drawback it is also disadvantageous to estimate the mass completeness from any one of the six izY JHK s detection bands either.For instance, the reddest, K s , samples the rest-frame stellar bulk only out to z ≲ 2, and will tend to underestimate stellar masses at z ≳ 2 until z ≈ 4.5 when it becomes a UV tracer.Thankfully, it is possible to estimate the impact of this additional incompleteness by examining a sample of galaxies common to this work and those of the comparably deeper ∼200 arcmin 2 CANDELS-COSMOS catalog of Nayyeri et al. (2017).This analysis is performed and discussed at length in Weaver et al. (2022).In summary, 75% of CANDELS sources at M lim are recovered by both The Farmer and Classic COS-MOS2020 catalogs, which combined with the choice of M lim being the 95 th percentile of M resc implies that M lim of the total sample corresponds to a mass completeness of ∼ 70%.
As shown in Fig. 2, the mass limit of the total sample is almost identical with that of the star-forming sample at all redshifts.This is unsurprising as star-forming galaxies generally dominate galaxy demographics.As in our NUVrJ selection from Fig. 1, we see the lack of quiescent systems at z > 5.5 and so we consider the star-forming subsample to be statistically equivalent to the total sample at z > 5.5.Deriving a consistent mass completeness limit for quiescent galaxies is more challenging.It is well known that the predominantly older, redder stellar populations of quiescent systems imply higher mass-to-light ratios than found in star-forming galaxies which in turn imply a lower degree of mass completeness at the same flux limit.Fig. 2 shows that our channel 1 derived mass completeness falls ≳ 0.2 dex lower than the bulk of the quies-cent sources.Taken at face value, this would appear to indicate a lack of quiescent systems at low-intermediate masses at z > 2. Again, we have cause for concern as channel 1 is not included in our izyJHK s detection strategy, and hence our selection function is liable to miss IRAC-bright sources that are not present in bluer bands.Furthermore, due to the predominantly red optical spectral slopes of quiescent galaxies, continuum emission in K s should be lower than in channel 1, implying that one would need a deeper K s magnitude limit to detect the same sources from shallower channel 1 imaging.In other words, it is expected that there are red sources visible only in channel 1 that are not included in our izY JHK s detected catalog.Furthermore, by z ≈ 4.5 K s no longer traces mass but rather UV continuum (channel 1 does so by z ≈ 8).Worse, the UV continuum in these systems is expected to be faint (even given 'frostings' of star formation in NUVrJ-selected post-starburst systems).Therefore, while deriving a mass completeness from channel 1 magnitude limits may be suboptimal due to possible selection effects with regards to a izY JHK s -selected sample, we cannot turn to K s as it is no longer a mass indicator at z ≳ 4.5.
One solution is to modify our selection function by incorporating izY JHK s undetected IRAC-only sources into our sample.While a systematic search for IRAC only sources is ongoing, their identification is made difficult not only because of the significantly lower resolution and consequently higher source crowding in necessarily deep IRAC images, but these sources also lack optical/NIR data which is, by definition, insufficiently deep to identify low-z interlopers.Deep mid-infrared (MIR) data redward of IRAC over sufficient degree-scale areas do not currently exist, making a determination of redshift, mass, and quiescent nature of these IRAC-detected sources problematically uncertain.
For the time being, whether or not this absence of intermediate-low mass quiescent systems is astrophysical cannot be determined.Literature measurements do not yield more masscomplete quiescent samples as the UltraVISTA DR4 NIR depths are now similar to those from even the deepest small-field NIR imaging, H 160 ≈ 25.9 (e.g., CANDELS, Tomczak et al. 2014), and even so, comparisons are hampered by field-to-field and photometric systematics.Comparisons with other stellar mass functions measured more consistently in COSMOS (e.g., Davidzon et al. 2017;Ilbert et al. 2013) are still affected by systematics from comparably less certain measurements from previous, shallower data, despite the lessened impact of cosmic variance.Additionally, comparisons with Davidzon et al. (2017) are complicated by the fact that they refit the photometry of Laigle et al. (2016) to produce new z and M estimates.Modifications include expanding the redshift baseline from z ≤ 6 to z ≤ 8 and adding in additional SED templates to probe star-bursting galaxies with rising star formation histories as well as dust-obscured systems.As demonstrated by a recent comparison by Lustig et al. (2022), these modifications produce lower number densities of massive galaxies compared to the original measurements.
We turn to theoretical frameworks to investigate this further.Namely, we use Flexible Stellar Population Synthesis (FSPS; Conroy et al. 2009;Conroy & Gunn 2010) to estimate the stellar masses produced by extraordinarily old stellar population produced by a delta-burst evaluated at z = 15.We assume a Chabrier ( 2003) IMF with log 10 (Z/Z ⊙ ) = −0.3.At each epoch (i.e., z) we normalize the evolved model spectrum to match our channel 1 observed frame magnitude limits (26.0 mag) and estimate the corresponding stellar mass shown by the solid red curve in Fig. 2.This exercise is repeated consistently with the K s magnitude limit (25.5 mag) shown by the dashed red curve.
We caution however, that the assumption that these models accurately describe real high-z quiescent galaxies is becoming increasingly dubious.Such a system formed in a monolithic delta-burst just following the big bang cannot reach quiescence (as defined by NUVrJ) above z ≈ 5.3, and yet remarkably mature systems at z ≈ 4 − 5 have already been reported in the literature (e.g., Schreiber et al. 2018a;Tanaka et al. 2019;Valentino et al. 2020).Worse, the spread of mass-to-light ratios found in quiescent systems means that a single mass completeness limit for all quiescent galaxies at a given redshift is ill-defined even for consistently selected (e.g., UV J, NUVrJ, sSFR) samples, resulting in a non negligible selection effects.
Nonetheless, we stress that the difference in completeness between the effectively flux-complete K s -derived mass limit and the mass limit derived from channel 1 magnitudes is only ∼ 0.3 dex, which is typically less than a single bin in our analysis.In light of this considerable uncertainty, we adopt the optimistic quiescent galaxy mass completeness limits derived via Pozzetti et al. (2010) from channel 1 magnitudes with the caveat that the lowest mass bins in each measurement at z ≳ 2 are potentially incomplete.We indicate the K s mass limit for quiescent samples throughout.Also, we note that our quiescent mass limit, although consistently determined in Davidzon et al. (2017), is more conservative despite the deeper NIR data.As will be discussed in Section 5.2, we attribute this to an overestimate of the quiescent galaxy mass completeness by Davidzon et al..The difference in mass completeness between the starforming and quiescent samples presents an additional complication.Because the star-forming galaxies can be reliably detected to lower masses than quiescent galaxies from our izY JHK s selection function, the low-mass regime of the total SMF at a given epoch, as defined in this work, does not actually include contributions from the lowest-mass quiescent systems.Therefore the total SMF is simply the star-forming SMF at masses below the quiescent mass limit.The effect from the undetected low-mass quiescent systems is almost certainly negligible: a simple extrapolation from the evolution in the number density of z ≈ 0.2 − 2.5 low-mass quiescent galaxies predict a number density at least 10× lower than that of the star-forming galaxies at z > 3. Furthermore, given that general uncertainties are still above the 10% level at low-masses even at z ≈ 2 (Fig. 3), any bias arising in the total number density from undetected low-mass quiescent systems is lower than other sources of error.

Derivation of the 1/V max correction
Intrinsically faint galaxies at any given redshift are more likely to be missed by survey selection functions compared to brighter sources.For a NIR-selected sample, this equates to a mass bias by which low-luminosity, low-mass galaxies can only be detected in smaller volumes (i.e.out to lower z) relative to brighter, more massive ones which could be detected if they were at higher z.This is the well known Malmquist bias (Malmquist 1920(Malmquist , 1922)).
The most straight-forward approach to correct for such a bias is the 1/V max method of Schmidt (1968), which has enjoyed significant popularity owing to its simplicity.Briefly, the 1/V max method statistically corrects for selection incompleteness by weighting each detected object by the maximum comoving volume in which it can be observed, given the characteristics of the telescope survey.The V max estimate per individual object is computed after finding the maximum redshift z max by which the best-fit SED would no longer be observable5 because of the survey's flux limit.On the other hand, the minimum redshift (z min ) should be the one at which the source would become too bright and saturate the camera, although in practice is the lower boundary of the z bin in which the considered galaxy lies (z low ).Therefore, for the i-th galaxy inside the bin z low < z < z high , the maximum observable volume is where Ω is the solid angle subtended by the sample, Ω sky ≡ 41 253 deg 2 is the solid angle of a sphere, and D cov (z) is the comoving distance at z (Hogg 1999).If z max exceeds the upper boundary of the redshift bin, the latter is used instead, meaning that the brightest sources are often assigned a weight that corresponds to the full volume of that redshift slice.As such, this correction is expected to be significant for only the faintest, lowest-mass sources in a given z-bin.While it is nonparametric and does not assume a functional form of the SMF, the 1/V max technique does assume that samples are drawn from a uniform spatial distribution, which is not accurate in the case of over-or under-dense environments (Efstathiou & Bond 1999).However, the assumption of a uniform spatial distribution is expected to be problematic only at z < 1, where large-scale structures have fully assembled, or in narrower surveys that can be biased by structures at smaller scales.Other methods exist which do not make this assumption such as STY (Sandage et al. 1979) and SWML (Efstathiou et al. 1988), a parametric and nonparametric maximum likelihood method, respectively.Already, Davidzon et al. (2017) found that the constraints provided by COSMOS2015 were sufficiently strong for the 1/V max method as well as more complex methods (e.g., STY, SWML) to essentially converge.With even stronger constraints and larger effective area provided by COSMOS2020, we can expect even better agreement, with minimal advantages to the more complex methods.More extensive discussions on strengths and weaknesses of the various approaches can be found in the literature (e.g., Ilbert et al. 2004;Binggeli & Jerjen 1998;Johnston 2011;Takeuchi 2000;Weigel et al. 2016).Fractional Error ( / ) 1.1 < z 1.5 2.0 < z 2.5 3.5 < z 4.5 4.5 < z 5.5 5.5 < z 6.5 6.5 < z 7.5 N CV SED Fig. 3: Adopted estimates for the total uncertainty σ Φ (solid) as a function of stellar mass at several redshifts (by color), for masscomplete samples only.Contributions include uncertainties from Poisson noise σ N (dashed), Cosmic Variance σ CV (dotted), and SED fitting σ SED (dash-dot).10% and 100% levels are indicated for reference.

Further considerations of uncertainty
We adopt a statistical error budget on the SMF number density Φ consisting of the quadrature addition of Poisson noise (σ N ), cosmic variance fluctuations (σ cv ), and uncertainties on masses induced by SED fitting (σ SED ) such that σ Φ = (σ 2 N +σ 2 cv +σ 2 SED ) 1/2 .Fig. 3 shows the composition of the total error budget from z = 1.1 to 6.5 as a function of stellar mass for mass-complete bins.

Poisson noise
Poisson noise arises from processes wherein the abundance of a discrete quantity (or counts) is measured.Although in the limit of many events a Poisson process becomes indistinguishable with that of a Gaussian, in small number counts it can be the dominant source of uncertainty.Here we compute the Poisson error σ N for each mass bin as

√
N where N is the number of objects in that bin.These values are recomputed for the star-forming and quiescent subsamples separately.As shown in Fig. 3, the fractional contribution from σ N increases with mass and redshift with the largest contribution at M > 10 11.5 M ⊙ .
The discrete nature of Poisson processes allows us to also provide upper limits on bins containing zero detected galaxies.Following Table 1 of Gehrels (1986), the statistical upper limit on Φ (N = 0) for a given observed volume V is σ N,limit = 1.841/V.See Ebeling (2003) and Weigel et al. (2016) for details and further discussions.

Cosmic variance
It is well established that galaxy properties are correlated with environmental density (i.e., clustering).Galaxy clusters, while being an important laboratory for galaxy evolution, are not typical of galaxy environments.Because of their density, they im-part a higher overall normalization to the stellar mass function.More noticeably, they tend to inflate the massive end of the mass function as they preferentially contain the most massive systems.This environmental field-to-field bias (so-called "cosmic variance") is a topic of intense study, and is a key component to accurately assessing sample uncertainties when trying to infer universal or intrinsic properties of galaxies.
Cosmic variance σ cv is estimated following Steinhardt et al. (2021), who adapt the methods of Moster et al. (2011) which, importantly, scale with stellar mass (up to 10 11.25 M ⊙ ) and are commonly adopted for use in 0.1 ≲ z ≲ 3.5 measurements as that is the redshift range in which the Moster et al. calculator was devised.However, above z ≈ 3.5 these estimates become increasingly underestimated, so Steinhardt et al. use linear perturbation theory to extend this work more reasonably to the early Universe, while maintaining agreement at z < 3.Although environmental density has known covariance with star formation (e.g., Davidzon et al. 2016;Bolzonella et al. 2010), we assume cosmic variance is equivalent between star-forming and quiescent subsamples.As shown in Fig. 3, the fractional contribution from σ CV is dominates the error budget for low-M systems at all redshifts and becomes progressively more important at high-M with increasing z.

SED fitting uncertainties and bias
Another consideration is the uncertainty on the stellar mass estimate provided by the SED fit.Fig. 4 shows the likelihood distributions on stellar mass at fixed redshifts and masses produced by LePhare sampling at ∆ log 10 (M/M ⊙ ) = 0.025 dex.Trends with width of the likelihood distributions indicate that mass is best constrained for low-redshift, massive (i.e., bright) sources.Although there is nonzero skewness and kurtosis in individual cases, the overall median distribution is symmetric.This is expected, as the uncertainty on stellar mass is essentially a measurement of the range of allowable templates and their normalization in the fitting procedure, and thus σ SED scales with photometric uncertainties.However, these likelihood distributions on stellar mass should be treated as lower limits as they do not take into account any covariance with redshift.
While these typical per-bin distributions can be valuable, especially for injecting noise into measurements from simulations, attempting to compute its contribution to the SMF, σ SED , using the typical width in a given bin is suboptimal as the wings of neighboring mass-bins contribute asymmetrically.To address this, we use the individual mass likelihood distributions to draw 1 000 independent realizations of the galaxy stellar mass function and thereby directly estimate the variance produced by the mass uncertainties, which we take as the 68% range about the median number density per bin of mass.As shown in Fig. 3, the contribution from σ SED become dominant only at M > 10 11.5 M ⊙ , in some cases becoming larger than unity.They are comparable to contributions from σ N across the entire mass range.
It is important to note that this does not account for systematic biases arising from SED fitting, such as assumptions of the stellar initial mass function6 , potential photo-z offsets, and assumptions as to the star formation histories that all propagate into the stellar mass determination.However, concerning the latter case, we show in Weaver et al. (2022) that the combination of The Farmer and LePhare achieves a subpercent photo-z bias even for faint, high-z sources (−0.004δz/1+z at 25 < i < 27) improving over other works including COSMOS2015.Systematic errors cannot be combined with random errors, and so additionally complicate measurements of the SMF.Given this indication of relatively low bias arising from the photo-z and significantly better constrained SEDs relative to previous measurements, we omit these considerations in the present work.See Marchesini et al. (2009) and Davidzon et al. (2017) for detailed discussions of various sources of bias and their effect on the SMF.

Eddington bias
The number of low-mass galaxies is orders of magnitude larger that of the highest-mass systems, and so a randomly chosen galaxy is overwhelmingly likely to be lower-mass.If even a small fraction of such truly low-mass systems scatter toward high-mass (owing to a M overestimate) it can significantly change the poisson-dominated high-mass number density estimate.The converse situation, while depleting the high-mass end, would have virtually no effect on the low-mass estimates.This is the well known Eddington bias (Eddington 1913).While generally understood to mean that there is a net bias leading to the overestimation of the density of massive galaxies, a small, but highly asymmetric uncertainty on the mass of low-mass systems can similarly generate such a bias that affects the shape of the low-mass regime.See Grazian et al. (2015) for further discussions.
Effectively correcting for Eddington biases has been a leading point of discussion in recent literature, generally favoring the

The Schechter function
Galaxy luminosity and stellar mass functions can be described empirically by the parametric formulation first introduced by Schechter (1976) in the context of the local Universe.Since then, the Schechter function has been adopted ubiquitously in statistical studies of galaxy mass assembly.It is more convenient to express the number density of galaxies per logarithmic mass bin d log M as given by Weigel et al. (2016): which describes a power law of slope α at masses smaller than a characteristic stellar mass M * , whereupon the function is cut off by a high-mass exponential tail.The overall normalization is set by Φ * , which corresponds to the number density at M * .Although Φ (M) is expected to evolve smoothly with redshift such that Φ (M, z) can be mapped from secondary functions M * (z), Φ * (z), α(z), most literature applications fit for each parameter independently at each redshift.A notable exception is Leja et al. (2019a).
Many studies have since found evidence that the total galaxy population at low-z is better described as the coaddition of two Schechter functions (e.g., Peng et al. 2010) whereby the low-mass and high-mass end acquire individual normalizations (Φ * 1 and Φ * 2 ) and low-mass slopes (α 1 and α 2 ) but retaining a single characteristic stellar mass M * .This so-called 'Double' Schechter form is as follows: The Double Schechter function has been found to be as suitable description in most of the studies in the local Universe and up to z ≈ 2. At higher redshifts, it is less clear whether this is a better description of Φ than a single Schechter function.Moreover, deviations have been recently been reported at z > 7 (e.g., Stefanon et al. 2021) in which the observed stellar mass function is better described by a power law than by any Schechter profile.

Results
Now having established the selections and methods adopted in this work, we present the resulting measurement of the SMF for the total, star-forming, and quiescent samples.We investigate also the evolution of number densities and quiescent fractions at fixed mass, and then fit the SMF for each sample with several methods to derive the evolution of the best-fit Schechter parameters.

The total galaxy stellar mass function
We measure the SMF for our total (star-forming and quiescent) sample divided in 12 redshift bins from z = 0.2 to 7.5 in fixed bins of stellar mass ∆M = 0.25 above the mass limit for that zrange7 .Shown in the left panel of Fig. 5, the shape and normalization of the SMF changes considerably over the ∼ 10 billion years corresponding to this redshift range.At z ≲ 3, the SMF features a smooth, monotonically decreasing number density at low-M that flattens before falling off steeply at M ≈ 10 11 M ⊙ , around the so-called 'knee' of the function.Its overall shape and normalization remains roughly constant until z ∼ 1.5 indicating a lack of mass growth at recent times, consistent with the decline of the cosmic star formation rate (Madau & Dickinson 2014).However, by z ∼ 1.5 the normalization decreases dramatically.The number density at the knee is only ∼ 1% of its z ≈ 0.5 value with the fastest growth occuring on the low mass end, consistent with mass "downsizing in time" (Cowie et al. 1996;Neistein et al. 2006;Fontanot et al. 2009).The slope of the low mass end steeps with time and is expected to be driven by physical processes including outflows, star formation efficiency, slope of the main sequence, and mergers (see discussions in Peng et al. 2014).At the same time, the knee itself becomes difficult to distinguish as the SMF takes the form of a smooth power law, and we become increasingly unable to constrain the low-mass end of the SMF due to worsening mass completeness (mass incomplete points are omitted in the figure).As expected from Fig. 3, the overall uncertainty grows significantly with increasing redshift and mass.We note that at a few epochs (e.g., z ≈ 5), the SMF is not strictly monotonic, likely driven by systematics rather than a real physical phenomenon (see discussions in Section 6).The evolution of the total SMF measured in this work is compared with literature results in Fig. 6.We begin at z ≈ 0.  Moneti et al. 2019) and so can leverage an area almost 2× larger out to at least z ≈ 6.5, beyond which the source density becomes clearly different between the deep and ultra-deep stripes.Thus, for the 6.5 < z ≤ 7.5 bin we use the 0.72 deg 2 subset of our primary area covered by the UltraVISTA deep stripes.
At z < 2.5, in the range that all three can be directly compared, we find excellent agreement with Ilbert et al. and Davidzon et al..This is unsurprising, as measurements from Davidzon et al. at z < 2.5 are adopted directly from COSMOS2015 and computed nearly identically as Ilbert et al. but with deeper imaging.Our measurements are similar, but with visibly less structure around the knee especially between 0.8 < z ≤ 1.1 where the volume density of sources is slightly higher, with the greatest increase at M ∼ 10 10 M ⊙ .However, at z ≈ 2. 5  -a trend made clearer when looking at quiescent systems in the Section 5.2.At z > 3.0, we observe a significantly higher volume densities of massive galaxies compared to Davidzon et al., although the general shape of the low-mass end of the SMF remains similar.However, if we similarly limit our SMF to only the Ultra-Deep region we find it decreases to about the existing lower 1σ limit at M > 10 11 M ⊙ .We speculate that this may be due to the presence of (proto-)clusters preferentially in the Deep region at 3 < z ≤ 4 (see Brinch et al. 2023), one of which has been recently spectroscopically confirmed by Mc- We do not include comparisons to stellar mass functions inferred from L UV -selected samples with stellar masses estimated from their UV luminosities (e.g., Song et al. 2016;Harikane et al. 2016) as UV-bright sources are only a portion of the gen-eral galaxy population and conversations between L UV and M are hazardous.Moreover, such studies often rely on color-color selections which are less certain than photo-z, in addition to being susceptible to a number of systematic effects when translating a UV luminosity function to a SMF by means of a L UV − M relation.We refer the reader to Davidzon et al. (2017) for basic comparisons and discussion.
While the smaller area surveys used by Grazian et al. (2015) and Stefanon et al. (2021) are more mass complete at lower masses, COSMOS2020 contains the widest contiguous NIR imaging at these depths and consequently provides a larger volume than any previous study (including Davidzon et al.) and so introduces new constraints on the rarest systems at all redshifts z ≤ 7.5 such as massive galaxies.Yet it appears that we are quickly approaching the statistical limit beyond which a larger survey is needed to find more massive systems, if they exist (e.g., Euclid, see discussions in McPartland et al. in preparation).The nature of these sources, their authenticity, and their potential Eddington bias is further assessed in Section 6.3.

The star-forming and quiescent galaxy stellar mass functions
In Section 3.2, we used an NUVrJ color-color selection to distinguish quiescent systems from star-forming ones and then estimated their respective mass completeness limits, which will differ since their M/L ratios are not the same.The corresponding SMFs are shown in Fig. 5 and compared to literature in Fig. 7.
Fitting results are discussed in Section 5.3 and details can be found in Appendix C; their corresponding Schechter parameters are reported in Tables C.2 and C.3 for the star-forming and quiescent samples, respectively.Fractional uncertainties on mass and cosmic variance are adopted from the total sample, leaving only the impact of poisson noise differentiated between the starforming and quiescent samples (see Section 3.5).
We can follow the development of quiescent systems out to z ≈ 5, although with significant uncertainty at z ≳ 4. As evidenced by Fig. 1, only ∼ 200 of quiescent systems are found at z ≳ 4, dropping precipitously to only one candidate by z ≈ 6.This is partly driven by mass completeness.At z < 3 the difference between the IRAC channel 1 mass limit and the effective mass completeness dictated by our izY JHK s selection function is ≲ 0.2 dex in mass.This difference grows to ∼ 0.4 dex at z ∼ 4.5 when K s falls blueward of the Balmer break causing the mass completeness to shift upwards to higher masses, indicated by the hatched region of the SMF of 4.5 < z ≤ 5.5 quiescent systems in Fig. 5.At this point the identification of quiescent systems is reliant on only a few bands, and is dependent on the particular SED templates (as suggested by the two distinct clusters in the upper right corner of the 4.5 < z ≤ 5.5 NUVrJ diagram), making such a distinction hazardous and susceptible to interlopers.
Our measurements of the star-forming and quiescent SMFs are generally in good agreement with other literature measurements in COSMOS, namely Davidzon et al. (2017) and Ilbert et al. (2013) as they are the only NIR-selected, mass-complete samples from which star-forming and quiescent subsamples are identified by NUVrJ.Other selections may induce additional systematics, and other separation methods (e.g., UV J, BzK, sSFR) implicitly adopt a different criteria for quiescence (see Davidzon et al. 2018;Leja et al. 2019b).As shown in Fig. 7, our measurements indicate similar high-M quiescent number densities compared to Ilbert et al. and Davidzon et al., but find significantly more low-mass lowest-M quiescent galaxies compared to Davidzon et al. (which are not measurable from the data used by Ilbert et al. 2013).Since our sample is derived from comparably deeper NIR data, it is expected to be complete down to lower masses relative to Davidzon et al.. Given the increased quiescent galaxy number densities near the mass completeness limit in our work, we conclude that the 70-80% completeness threshold of Davidzon et al. is underestimated by a factor of ∼ 2 − 10 across this z-range, and is more likely only 15 − 35% complete.We note, however, that this in agreement with worst-case scenario discussed by the authors (see Section 4.2 of Davidzon et al. 2017).
The SMFs of star-forming and quiescent galaxies have remarkably different shapes and evolutionary histories.As shown by Fig. 5, star-forming galaxies at 0.2 < z ≤ 0.5 follow a double Schechter form (Equ. 7) with a characteristic stellar mass M * ≈ 10 10−11 M ⊙ and yet by z ≈ 3 it appears to flatten into a smooth powerlaw-like form with a lower overall normalization.Meanwhile the SMF of 0.2 < z ≤ 0.5 quiescent galaxies follows the form of a double Schechter (Equ.8) with a low-mass upturn and a similarly positioned M * (see Moutard et al. 2016) but the form appears to loose its lower-mass Schechter component around z ∼ 2. Although this may be physical, the fact that the quiescent sample is less mass complete at a given z means that the low-mass end of the total sample reflects only the contribution from star-forming systems.However, the contribution from low-mass quiescent systems, if they exist, can be expected to be < 1%.
This evolutionary picture is perhaps more easily understood by Fig. 8.Here galaxies have been binned by mass instead of redshift, allowing for a more direct comprehension of the growth, or lack thereof, in the number density of galaxies at fixed mass.We first notice a growth rate in the number densities that is strikingly consistent across all masses from z = 7.5 → 1 (i.e., similar slopes in each mass bin), constituting ∼ 5 Gyr or ∼ 36% of the history of the Universe, consistent with recent findings by Wright et al. (2018).This consistent growth is made especially clear in the lower left panel where the growth is relative to that of the total sample at z 0 ≡ 0.2 < z ≤ 0.5.Although the uncertainties are considerable, there may be a hint that the growth is fastest (i.e., the slope is higher) for systems at M ∼ 10 10.5−11.0M ⊙ or at the 'knee' of the mass function where star formation is hypothesized to be the most efficient (Moster et al. 2013;Behroozi et al. 2013;Wechsler & Tinker 2018).However, we note that Eddington bias remains uncorrected and so a growth in the bias strength with redshift may produce this apparently slow evolution (as suggested by Fig. 4).While the least massive systems are always more common than more massive ones, this is simply a consequence of the monotonic shape of the SMF.
Another interesting feature is that the number density of M = 10 9.5−10.0and M = 10 10.0−10.5 M ⊙ systems appear to decrease mildly at z ∼ 2.5.There are fewer such systems in this work relative to both Davidzon et al. and Ilbert et al., and so this may be indicative of an incompleteness in our selection, or alternatively a bias in z and/or M. Determining the origin of this difference is nontrivial, and so we simply caution against overinterpretation of this feature.
The highest-z constraints are difficult to interpret due to the incompleteness of low-mass systems (which are omitted) and the rarity of the most massive sources.The fact that the constraints at z ≳ 5 overlap with the Φ(N = 0) upper limit region in gray indicates that even the comparably large, deep volume of COSMOS is insufficient to provide robust measurements of the number density of massive galaxies at these early times.Future constraints will be derived from even larger volumes, which are expected to be delivered soon by near-infrared wide-area deep surveys such as Roman and Euclid.
This nearly uniform growth in the number density of sources slows down at z ≲ 1 in all but the least massive bin, with almost no growth for M > 10 10.5 M ⊙ systems.To understand this further, we examine the growth of the number densities of the star-forming and quiescent subsamples.As seen in the middle column of Fig. 8, star-forming systems maintain their monotonic mass-ranked order.In addition, their number density grows rapidly until z ≈ 1, when it starts to decrease.As shown in the rightmost column, quiescent galaxies follow a different pattern.Instead of a monotonic, mass ranked growth in number density, quiescent systems around the knee at 10 10.5 M ⊙ are always the most numerous, with a slower rate of growth compared to the least massive and most massive bins.While the evidence for massive quiescent systems at z > 2 is hampered by the limited volume of COSMOS, we can confidently see that they appear to grow quickly in number at z < 2. By z ≈ 1 the number density of the most massive ones (M = 10 11.5−12.0M ⊙ ) stabilizes at ≈ 3 × 10 −5 Mpc −3 dex −1 .Reasons for this are discussed in Section 6.3.Fig. 9 shows the evolution of the fractional contribution of quiescent systems in four 0.5 dex bins at fixed mass and one wider dex bin.The quiescent fraction of low mass systems increases with time monotonically, and at a higher rate of growth than those of the highest mass, at least for z < 2 where comparisons can be made.The quiescent fraction of the most massive systems at any redshift is more affected by both Eddington bias and poisson noise compared to low-mass systems and and so should be taken as an upper limit.Nevertheless, it seems that between 20−30% of M > 10 11 M ⊙ galaxies are quiescent from z ≈ 5 → 1, growing above 50% and plateauing at z < 1.

Inferring the intrinsic mass function
So far we have accounted for three key sources of statistical uncertainty in Section 3.5.However, Eddington bias remains a source of systematic uncertainty that has not yet been removed.
To do so, we fit the observed total, star-forming, and quiescent SMFs with Schechter functions convolved with a kernel which attempts to describe the mass uncertainty for a given mass and redshift interval.This approach is common in the literature (Ilbert et al. 2013;Davidzon et al. 2017;Adams et al. 2021).We adopt a two component parametric kernel of the form introduced by Ilbert et al. (2013): which describes a core Gaussian component with a constant width σ Edd in product with a Lorenztian component which provides wide wings that are crucial to capturing catastrophic errors in SED measurements.The other free parameter τ Edd is redshift dependent such that τ Edd = τ c (1 + z), widening the wings with increasing z.Instead of fitting the kernel directly to the L(M | z) distributions (as in Davidzon et al. 2017 andAdams et al. 2021), we determine the kernel parameters from directly fitting the total SMF leaving the kernel free to vary independently at each redshift.However, both σ Edd and τ c are nuisance parameters and leaving them free to vary is may produce over-fitting of any one SMF at a given z.As expected, the resulting parameter distributions with respect to z contain outliers on either side of the median particularly where the kernel shape is most degenerate with that of the intrinsic SMF; we adopt the median values σ Edd = 0.2 and τ c = 0.1.The resulting kernel is significantly wider than the L(M | z) distributions shown earlier in Fig. 4.This is because the latter assume perfect redshifts and therefore underestimate the full M uncertainty as there is likely to be a covariance with z.In addition, there is a mild evolution of L(M | z) with M, which we have omitted from our kernel form for the sake of simplicity and to avoid overfitting.Interestingly, while σ Edd ≈ 0. At the present, precisely determining the correct, intrinsic shape of the Eddington bias, and its evolution with z as well Age (Gyr) Fig. 9: Evolution of the fraction of quiescent galaxies as a function of redshift in four bins spanning 10 9.5 < M ≤ 10 11.5 M ⊙ .Uncertainty envelopes correspond to 1 and 2σ limits.Points which are mass incomplete are not shown.
as M is problematically difficult.In addition, the results stemming from these kernel-convolved fits suffer a degree of degeneracy with the kernel parameters (fixed or unfixed).We are not alone in issuing this caution; although the SMF measurements of Davidzon et al. (2017) is similar to those of Grazian et al. (2015), their different choice of convolution kernel caused their inferred Schechter parameters to differ.Recently, Adams et al. (2021) explored the impact of the assumed shape of the kernel, as well as various other systematics at z < 2, highlighting the full extent of the problem.Pushing measurements of the SMF to higher redshifts, where fewer constraints are available, naturally increases the influence of the kernel shape, and so the results derived here for α, M * , and Φ * (Tables C.1, C.2, C.3) should be taken in conjunction with our assumed kernel and its evolution with z.
When fitting each sample (total, star-forming, and quiescent) we assume a double Schechter parametric form (Equ. 8) and move to a single Schechter form (Equ. 7) when it achieves a lower χ 2 per degree of freedom.The point at which this change occurs, and when various parameters are fixed, are different for each of the three samples as it depends on not only their shape but also M lim .The fitted points follow from before: we account for minor incompleteness on the low-mass end using the 1/V max correction and include only mass complete M-bins adopting the uncertainty budget from Section 3.5.We proceed in two stages, fitting first using a simple and efficient χ 2 minimization routine (minuit, James & Roos 1975) whose resulting best-fit parameters are used to set the initial positions of walkers in a secondary and longer Markov Chain Monte Carlo (MCMC) optimization (emcee, Foreman-Mackey et al. 2013).It is generally appropriate to initialize walkers at well defined initial positions, assuming each chain is linearly independent (in this case scattering them by 1% of the initial parameter values) and allowed to converge well past its respective autocorrelation length such that it is not sensitive to those initial conditions (Hogg et al. 2010).Flat priors are used for each parameter, with generous limits that are typically not encountered during a given fit.500 MCMC walkers are initialized to seek the state of maximum likelihood derived similarly to the minimum χ 2 , and do so following the "stretch move" (Goodman & Weare 2010) until converged, defined as having every chain iterate for at least 10× their autocorrelation length, and every autocorrelation length having changed by < 1% of their previous value.We do not see any significant differences by using a different move style (e.g., Red-Blue, Metropolis-Hastings), suggesting that the data provide sufficient constraining power.Although we include the original χ 2 results throughout, we focus on the MCMC results which provide full posterior sampling, which are taken from the last 90% of each chain to avoid only the initial burn-in when the parameters are only beginning to converge.The MCMC and χ 2 methods generally find consistent results.Fitting solutions corresponding to both the median posterior and maximum likelihood for the total, star-forming, and quiescent samples is shown in Fig. 10.An alternative version binned in redshift (as opposed to total, star-forming, or quiescent sample) is included for reference in As evidenced by the low χ 2 values, the double Schechter well describes the observed SMF at z ≤ 3.No parameters are fixed other than those of the kernel.While a single Schechter finds a better fit at z > 3, the increasing mass completeness limit weakens the constraints on α 1 .To avoid overfitting, we fix α 1 to the value from the best-fit value at 2.5 < z ≤ 3.0 and do so for the χ 2 and MCMC methods independently but to excellent agreement.All of the unfixed search parameters are generously bounded such that the chains are not likely to encounter them.However, for the sake of physicality especially at highz, we choose to bound the search space for the normalization Φ (z > 0.5) (both Φ 1 and Φ 2 where applicable) to below the upper 68% on the posterior distribution of max (Φ 1 , Φ 2 ) of the previous mass bin.
It is important to note that from χ 2 minimization we obtain a single set of parameters that have minimized the χ 2 as well as their symmetric, Gaussian uncertainties.On the other hand, MCMC provides only posterior distributions that can be used to estimate parameter uncertainties but do not imply a unique definition of "best-fit parameters".Commonly, the median is the best-fit statistic of choice, bounded by a 68% one-parameter confidence interval (which ignores covariance).However, for highly skewed posteriors the median may lie out in the wings and is therefore not a typical value for that parameter.In this case, the most obvious choice may be the parameters corresponding to the model which has found the maximum likelihood.However, MCMC cannot provide an uncertainty on these maximum likelihood parameters, limiting its use.Worse, the model uncertainty cannot be derived from the posterior widths, as they also are covariant and so the resulting error envelopes will be meaningless.The most obvious choice then is to compute the medians and widths of posterior distributions of the chains after burnin.However, the curve traced by the median is not guaranteed to reflect the actual model function, and the 68% confidence envelopes may not contain the maximum likelihood nor the median posterior.We therefore strongly advise that models are reconstructed using the maximum likelihood parameters, and that caution should be exercised when interpreting best-fit parameters from median posteriors.Thankfully, the situation is less severe for symmetric posteriors, which for the total SMF are generally symmetric at z ≲ 6.At 6.5 < z ≤ 7.5, the posteriors are highly skewed as the relatively weak constraints produce widespread parameter covariance.The maximum likelihood results in all cases appear reliable.The evolution of the fitted Schechter models for both the median posterior and maximum likelihood is shown in Fig. 10.In addition to the fits at fixed redshift, we also use the method of Leja et al. (2019a) to fit both the shape and evolution of the SMF simultaneously.This aptly named "continuity model" is fitted on the unbinned distribution of mass-complete sources in z and M8 .Double Schechter parameters M * , Φ 1 , and Φ 2 are treated as continuous functions over time described by second order polynomial expansions in z.The slopes α 1 and α 2 are assumed to be constant.These 11 parameters are joined by a minor parameter σ eff which attempts to capture the cosmic variance.The effects of Eddington bias are incorporated by resampling the entire catalog by 10 random draws from their L(M | z).Only the mass-complete points are fitted, which allows sources near the mass limit to scatter in and out of the SMF realizations.Importantly, this method accounts for the significant covariances in the Schechter function between adjacent redshift bins which is neglected when binning in redshift, and exploits it to provide generally tighter parameter constraints; see Leja et al. for details.We employ emcee to determine the posterior distributions and maximum likelihood fit of the continuity model, noting the aforementioned caveats.Initially we tried to fit the entire SMF out of z = 7.5, but the fits were inaccurate due to the limited range of evolution which can be described by a second order expansion.Already known from Davidzon et al. (2017), Φ rises quickly before slowing down toward z ∼ 0. The second order expansion in z can either fit the quick rise or the slow down, but not both.We therefore only consider galaxies at z ≤ 3 in our continuity model fit, as Leja et al. do, and leave modifications to the fitting functions to future work.The expansion shapes are determined not by their coefficients, but rather by the amplitudes of three anchor points at fixed redshifts.The location of these anchor points is largely arbitrary, and so we follow Leja et al. in choosing z = 0.2, 1.6, and 3.0.Given the larger parameter space, we randomly initialize 100 walkers which explore the space until converged as before.We cannot apply the 1/V max correction directly to the continuity model, and so to remain conservative we exclude sources in the lowest 0.25 dex from the M lim given in Equ. 3 for the total sample.
The evolution of the Schechter parameters inferred from the star-forming and quiescent SMFs are shown in the rightmost panels in Fig. 11 The treatment for the star-forming SMF fit is similar to that of the total form.We begin with a double Schechter form at 0.2 < z ≤ 0.5, and transition to a single Schechter form at 2.5 < z ≤ 5.5 with α 1 fixed.The best-fit model describes the observed SMF reasonably well until z ≈ 3.5 − 5.5 where an excess of high-mass star-forming systems is observed that cannot be described by a single Schechter form.Possible reasons for this are discussed in Section 6.3.
The quiescent SMF behaves noticeably differently from that of the star-forming sample and therefore requires a different strategy.We begin with a double Schechter form at 0.2 < z ≤ 1.5 Evolution of the best-fit Schechter parameters computed from fits to the star-forming (blue) and quiescent (orange) SMFs from the χ 2 regression (unfilled colored diamonds) and likelihood methods using fixed redshifts bins (median posterior as filled points with ±34% envelope, maximum likelihood as unfilled squares).
but fix α 2 at 1.1 < z ≤ 1.5 as the constraints on the low-mass regime deteriorate.With no significant low-mass constraints, we transition to a single Schechter at 1.5 < z ≤ 5.5.We note that this disregards a possible low-mass contribution and so the extrapolation to low-masses is likely an underestimate at z > 1.5, depending on the growth of low-mass quiescent galaxies beyond our mass limit (see Santini et al. 2022).The normalization of the quiescent SMF continues to steadily decline with z.Given the uncertainty about sample completeness at 4.5 < z ≤ 5.5, we necessarily fix α 1 to ensure a reliable fit of the secure measurements at high-M.Consequently, we find a mild rise in the expected M * from z ≈ 3 → 5.
The evolution of the inferred total, star-forming, and quiescent SMFs resulting from the fixed redshift and continuity model fits are shown in Fig. 10.Furthermore, the evolution of the Schechter function parameters inferred from each sample are shown in Fig. 11.Although the evolutionary physics inferred from the fitting will be discussed in Section 6.1, we briefly describe the immediate result here.In general, parameters derived at fixed redshift from the χ 2 , maximum likelihood, and median posterior agree well for the total, star-forming, and quiescent SMFs, with few exceptions.
The characteristic mass is found to be M * ≈ 10 10.7−11.0M ⊙ with very little significant evolution until z ≈ 3 when it begins to decrease with increasing z, with the continuity model suggesting a potentially increasing value with z.This contrasts with results from Davidzon et al. (2017), who find an initial decrease in M * which increases at z > 3. The results of the continuity model of Leja et al. (2019a) agree well with the lack of evolution suggested by our measurements, although in some tension with our continuity model results.M * of the star-forming sample similar in shape but generally larger than that of the quiescent sample.The likeness to the total M * at fixed redshift is proportional to the sub sample that dominates around the knee in that z interval.
The low-mass slope of the low-mass Schechter component α 1 is roughly constant with time, although may experience a maximum at z ≈ 1, in agreement with Davidzon et al..However, whereas Davidzon et al. finds a steeply decreasing α 1 at 2 < z ≤ 5, we find a constant evolution which we then fix at z > 2.5 to avoid mounting degeneracies as the low-M constraints are lost; this is also where a single Schechter becomes the preferred form.The value of α 1 in good agreement with that found from our continuity model fit, as well as that of Leja et al. who formed a composite SMF with significantly lower-M and hence α 1 constraints.While α 1 slope of the star-forming sample is similar to that of the total sample, the α 1 slope of the quiescent sample is poorly constrained and appears to decrease with redshift (i.e.steepen with time), consistent with the rapid appearance of low-M quiescent systems at late times.
The normalization of the low-mass Schechter component Φ 1 has little evolution at 0.2 < z ≤ 1.0, afterwards rapidly decreasing until it appears to approach zero asymptotically.Although Davidzon et al. finds slightly larger values of Φ 1 at low-z, both measurements generally agree on the rapid decline of the lowmass normalization.Φ 1 derived from the continuity model finds still lower values and hence a more gradual evolution 9 .Yet, Φ 1 derived by Leja et al. agrees well with our fixed redshift measurements.As with α 1 , the Φ 1 normalization of the star-forming sample is similar to that of the total sample and Φ 1 of the quiescent sample remains significantly smaller and decreases with redshift.
The low-mass slope of the high-mass Schechter component α 2 is statistically consistent with being constant with z, but may rise slightly toward z ≈ 2. This is broadly comparable with Davidzon et al. within the stated uncertainties.α 2 derived from our continuity model agrees well with the fixed redshift measurements, and with that of Leja et al.. Interestingly, at z ≲ 3 we find that α 2 -α 1 ≈ 1, which is predicted by Peng et al. (2010) as an indicator of mass quenching.The α 2 slope of the star-forming sample is similar to that of the total sample, with α 2 of the quiescent sample changing from negative at z ≲ 1 to positive at 1 ≲ z ≲ 2 where it appears to stabilize.
Lastly, the normalization of the high-mass Schechter component Φ 2 generally declines over z = 0.2 → 3.0.Interestingly, the continuity model finds an initial increase in Φ 2 toward z ≈ 1, declining afterwards in agreement with the fixed redshift estimates.We find lower values compared to those of Davidzon et al. and Leja et al., which are in general agreement in part because they both use COSMOS2015.This suggests that the different evolution in Φ 2 between this work and the literature may be driven by subtle differences between the catalogs, sample selection, assumed Eddington kernels, or a combination of the three.The Φ 2 normalization of the star-forming sample is subdominant to that of the quiescent sample with both decreasing steadily with time, consistent with the large fraction of quiescence found in massive systems out to z ≈ 2.5.
Overall we find good agreement with the most directly comparable literature measurements from Davidzon et al. (2017) and Leja et al. (2019a).The implications for these evolutionary trends, and their relation to the growth of massive quiescent galaxies, are explored in detail in Sections 6.3 and 6.4.

Discussion
In this section we focus on some of the open issues in galaxy evolution that can be addressed from a statistical perspective, using the results of Section 5 as a starting point.Besides the assembly of z ≈ 3 − 5 massive galaxies and their quiescent fractions, we also discuss derived measurements such as the cosmic stellar mass density ρ * , as well as the relation between stellar and dark matter halo mass functions.We include comparisons with simulations to provide physical insight and also to highlight areas for improvement from both sides.

Cosmic stellar mass density evolution
Galaxy mass assembly and growth is inextricably related to star formation.As such, reconciling direct measurements of galaxy mass growth with the behavior of the star formation main sequence (e.g., Brinchmann et al. 2004;Noeske et al. 2007;Daddi et al. 2007;Salim et al. 2007;Whitaker et al. 2012Whitaker et al. , 2014) ) is of great interest.Meaningful assessment of empirical models of galaxy growth that link the two (e.g., Peng et al. 2010;Behroozi et al. 2019) depends on unbiased, accurate measurements of the integrated mass density ρ * and its evolution with z.
We integrate the SMF measurements presented in Section 5 to derive an estimate of the stellar mass density (ρ * ) for each bin of z.Although definitions vary, ρ * is commonly integrated down to 10 8 M ⊙ .Since our M lim at all redshifts is larger than 10 8 M ⊙ , we integrate into the extrapolated low-M regime of our (unconvolved) Schechter models.The resulting ρ * for the total sample in Fig. 10 is compared with other literature measurements in Fig. 12, converting to a Chabrier IMF where relevant.All SMF-based literature measurements of ρ * have been reintegrated consistently to the same mass limit.We also show ρ * derived from integrating the star formation rate density (SFRD) function of Madau & Dickinson (2014), assuming a 41% return fraction corresponding to Chabrier's IMF (see Section 6.1 of Ilbert et al. 2010).
We find remarkably good agreement with previous observational studies.At z ≲ 3, the agreement seems to be limited by systematics as these are generally well measured, secure samples.However, ρ * at z ≳ 3 is dominated by significantly less certain measurements, due both to the size of the samples and their typically noisy photometry.Our measurements place ρ * near the midpoint of the scatter, in closest agreement with Davidzon et al. (2017, z ≤ 5) and Grazian et al. (2015, z ≤ 7).The relatively large uncertainties on ρ * beyond z ≈ 4 stem from the increasing degeneracy of M * and ϕ with decreasing low-M constraints (α being fixed at z > 1.5; see Fig . C.1).
At z ≳ 5, our estimated median ρ * as well as that of Grazian et al. is lower than the SFRD-derived predictions of Madau & Dickinson (2014).This discrepancy could suggest that while star formation is high, the mass growth is lagging behind.As intriguing as this is, we stress that the measurements are consistent within 1σ, that the SFRD constructed in Madau & Dickinson are constrained by only two surveys at z > 5 available at the time (Bouwens et al. 2012a,b;Bowler et al. 2012), and that our measurements assume a fixed low-mass slope α.Further work utilizing larger samples complete to lower masses will be required to confidently evaluate this divergence at z > 7. Age (Gyr) Fig. 12: The 0.2 < z ≤ 7.5 cosmic stellar mass density.Upper: Evolution of the cosmic stellar mass density of the total sample computed from the best-fit likelihood models (blue) integrated above 10 8 M ⊙ .Literature results from observational studies of mass-selected samples (Grazian et al. 2015;Tomczak et al. 2014;Caputi et al. 2015Caputi et al. , 2011;;Ilbert et al. 2013;Muzzin et al. 2013;Santini et al. 2012;Adams et al. 2021;McLeod et al. 2021;Wright et al. 2018;Thorne et al. 2021) and mass inferred from rest UV measurements (González et al. 2011).By integrating their SFRD functions, we can plot ρ * from Behroozi et al. (2013) and Madau & Dickinson (2014).In both cases we assume a return fraction of 41% (based on Chabrier's IMF, see Section 6.1 of Ilbert et al. 2010).For Madau & Dickinson (2014), we include a shaded area based on return fractions between 25-50% (the latter value is similar to the one given by Salpeter's IMF).Lower: Evolution of the cosmic stellar mass density of the total (gray, repeated from above), star-forming (light blue), and quiescent (orange) samples compared to literature measurements (Behroozi et al. 2019;Santini et al. 2021;McLeod et al. 2021).
In addition to ρ * of the total sample, the lower panel of Fig. 12 includes ρ * for the star-forming and quiescent samples.While stellar mass is overwhelmingly concentrated in starforming systems from z ≈ 7 → 3, the mass density of quiescent systems grows rapidly until flattening at z ≈ 1, consistent with Fig. 8. From z = 1 → 0, there is no growth in ρ * for either starforming and quiescent systems where the former remains larger than the latter.
While ρ * of the star-forming sample is well constrained out to lower masses, the same cannot be said about ρ * of the quiescent sample as its SMF is less mass complete in comparison.Since we cannot directly determine its low-M shape, we assume that the low-M Schechter component effectively vanishes at z ≳ 1.5.If this is not the case, then we underestimate ρ * for the quiescent sample.This may help explain the differences observed with respect to Universe Machine from Behroozi et al. (2019) who report generally larger values of ρ * for their quiescent sample, which features a significant quiescent population at low M (see Fig. B.1).Although in agreement at 0.2 < z ≤ 0.5, Behroozi et al. report higher quiescent fractions at < 1010.7 M ⊙ with redshift, finding more than 10× as many 10 9 M ⊙ by z ≈ 1. Universe Machine primarily defines quiescence as sSFR< 10 −11 yr −1 , which according to Davidzon et al. (2018) is comparable our NUVrJ-selection, and so this overproduction of quiescent galaxies may indeed be at odds with our observations.Although not readily available, a consistently NUVrJ-selected sample from Universe Machine would clarify this sensitive comparison.Comparisons with the results of Santini et al. ( 2021 2021) adopts a UV J selection following Carnall et al. (2018a).Therefore, we stress that these estimates of ρ * for our quiescent sample are particular to NUVrJ-selected quiescent systems, and our assumption of a single Schechter description at z > 1.5 may drive our relatively low estimates.Finally, despite differences in selection and constraining power, these observational studies collectively indicate a general agreement as to the evolution of ρ * for quiescent systems.

Comparison to simulations
Observational constraints on the shape and evolution of the SMF may be useful in their own right, but we can also gain meaningful insight by comparing them with SMFs produced by galaxy formation simulations.Fig. 13 shows the SMF constraints from this work and the inferred SMF derived by the binned MCMC fits evaluated at maximum likelihood, with the flagship or reference versions of simulated hydrodynamical SMFs overlaid: TNG100 of the IllustrisTNG project 10 , Eagle, Simba and Flares (Pillepich et al. 2018;Furlong et al. 2015;Davé et al. 2019;Lovell et al. 2021, respectively).It is worth noting that Flares uses the same code and model as Eagle (specifically they adopted the strong AGN feedback Eagle model introduced in Crain et al. 2015), but sample regions that span a wider dynamical range, and therefore has an much larger effective volume than Eagle.We also include the semi-analytical results from Shark (Lagos et al. 2018) as well as those of Santa Cruz (Yung et al. 2019a,b, see also Somerville et al. 2015).To note, we do not apply any artificial normalization (aside from h considerations) to any model or observation such that direct comparisons are possible from the figures alone.
Since our observed SMF measurements are affected by Eddington bias, it is preferable to compare the SMF directly predicted by the simulations assuming no errors to our inferred SMF To note, we do not apply any artificial normalization to any model or observation.Upper limits for empty bins are shown by the horizontal gray line with an arrow.Mass incomplete measurements are not shown.
with considerations as to the assumptions made in our Eddington bias correction.We acknowledge that each simulation has multiple flavors with variations to their physical recipes.Although these variations provide additional insight, care must be taken when making these more complicated comparisons.As the goal of this work is to provide constraints on the observed and inferred SMF, we reserve comparisons to the variations of simulated SMFs for future work.
Overall, Flares, Shark, and Eagle perform best below M * , with Simba, Flares, and the Santa Cruz SAM performing best above M * .At high-z (z ≳ 4), we find the best agreement with Simba, Illustris TNG100, and the Santa Cruz SAM.At lowz (z ≲ 1.5), however, there is a slight preference toward Simba as Illustris TNG100 underestimates the number densities at M * for z ≤ 1.5.The situation is different at 1.5 < z ≤ 3.0 where Simba significantly overestimates the high-M end.While this could be explained by a systematic bias in observed masses or a missing high-M population (see Section 6.3), it could also be due to Simba potentially over-grouping several separate dark matter haloes into one massive halo and/or insufficient AGN feedback at early times.Over this same range, it is apparent that Eagle suffers from volume limitations and thereby does not contain the most massive galaxies.This is most clear from the comparison between Eagle and Flares, which are based on the same physical recipes, but the latter sample much rarer overdensities, and hence captures the high-M population.Both the Shark and Santar Cruz SAMs fare better as their semi-analytical prescriptions are able to produce high-M systems.In some z-bins, both Shark and Eagle assemble low-M systems too early relative to the observed SMF where there are direct constraints (most apparent at 1 ≲ z ≲ 2).We find a lesser degree of agreement at z > 3.5: while both Shark and Eagle underproduce the number of galaxies at all M, there is significant scatter between Illustris TNG, Simba, Flares, and Santa Cruz.Meanwhile the volume limitations of Illustris TNG100 become significant at z > 6 (see Pillepich et al. 2018 for comparisons with TNG300).
It is remarkable that all of the simulations 11 reproduce the 0.2 < z < 0.5 SMF, and yet at higher redhshifts display severe disagreement with our observations and each other with number densities differing by more than a factor of ten.While this should not come as a surprise given that simulations are typically tuned to reach an end state in agreement with the local Universe (despite known disagreements with the observed sSFR history and merger rates, see Popesso et al. 2023, Conselice et al. 2022), it suggests that their initial conditions and early evolutionary behavior are strikingly different such that variously tuned physical recipes and initial conditions can produce the same end state.This highlights the need for continued observations to improve constraints on the SMF at high-z where simulations can be critically tested, and their physical thereby recipes refined.In addition to the treatment of AGN and associated outflows (e.g., Debuhr et al. 2012;Richardson et al. 2016;Mitchell et al. 2020), modeling of dust attenuation at high-z is a particularly relevant concern at z ≳ 4 where the dust content of galaxies has been largely unknown, and in turn makes stellar mass estimates based on rest-frame UV/optical light increasingly uncertain.This, coupled with uncertainties in dust production mechanisms at high-z, can lead to simulations diverging significantly (from each other and observations), and has often been cited as a major contributor to such discrepancies (see discussions in e.g., Kokorev et al. 2021).Although future observations will benefit from improved more comprehensive mass uncertainties, our current constraints indicate that while it appears that simulations are able to reproduce the observed abundance of high-M galaxies and so the production of early low-M systems needs improvement.
6.3.Abundant massive galaxies at z ∼ 3 − 5 One of the most striking results highlighted in Fig. 6 is the high number density of massive M > 10 11 M ⊙ galaxies not only at z > 3.5.Although few in number, their identification in COSMOS2020 is a direct consequence of utilizing the larger 1.27 deg 2 now accessible with deep, homogeneous NIR coverage.While no M > 10 11 M ⊙ galaxies are observed at z > 5, their growth since then has been similar to galaxies at other masses, as shown in Fig. 8.The majority of M > 10 11 M ⊙ systems are starforming at z > 1, as shown by Fig. 7 and quantified in Fig. 9, with only z < 1 systems at M ≈ 10 10.5−11 M ⊙ being equally divided between star-forming and quiescent states.We find evidence for a sustained population of massive quiescent systems at z > 2, but their number densities are dwarfed by that of star-forming systems by a factor of ∼ 10.The existence of these massive quiescent systems seems to defy the timescales expected for mass assembly (Steinhardt et al. 2016;Faisst et al. 2017;Schreiber et al. 2018b;Cecchi et al. 2019), and so their tremendously early formation and growth are a topic of great interest (Caputi et al. 2011;Toft et al. 2017;Hill et al. 2017;Carnall et al. 2018b;Tanaka et al. 2019;Valentino et al. 2020;Whitaker et al. 2021;Akhshik et al. 2023;Marsan et al. 2022), including a more focused investigation into the origins of similarly selected massive galaxies from COSMOS2020 by Gould et al. (2023).
Despite nearly identical sample selections, we still find greater number densities of massive galaxies compared to Davidzon et al. (2017) at 3 < z ≤ 5.5.By comparing to COS-MOS2015 (matched within 0.6 ′′ ), we find that 78% of M > 10 11 M ⊙ galaxies in our sample are also found in COSMOS2015 where they are exclusively high-z (91% agree within ∆z ± 0.5) and high-mass (78% agree within ∆M±0.5 M ⊙ ).The remaining 23% are new sources found only in COSMOS2020.As shown by Fig. 14, they are predominantly faint, having a median K s magnitude ≈ 24.2 AB compared to the median sample brightness K s ≈ 23.4 AB with a photometric uncertainty 0.05 − 0.1 AB (see Fig. 10 of Weaver et al. 2022).They are generally as massive as the already known sources from COSMOS2015 and follow the same M distribution.They are also remarkably red, having a median H − K s color of ≈ 1.2 AB compared to that of the sample overall (≈ 0.8 AB).It seems unlikely that they are now found because of the better de-blending by The Farmer as visual inspection shows that they are generally isolated sources and are detected by Source Extractor in the Classic version of the catalog.Therefore the most probable explanation for the new faint, red sources is that the deeper UltraVISTA Y JHK s (as well as HSC iz) have enabled the detection of these faint red sources which in COSMOS2015 could not be detected.Interestingly, we find that their distribution in stellar mass is consistent with that of the total sample such that number densities at all masses M > 10 11 M ⊙ are proportionately represented.
Visual inspection of photometry and SED fits indicate that nearly all of the 3 < z ≤ 5.5 M > 10 11 M ⊙ galaxies have red colors.About 75% of them are selected as star-forming by NUVrJ, ≳ 80% of which are likely attenuated by a thick screen of dust (A V > 1; compared to only ∼ 10% across entire sample).The remaining ∼ 25% are classified as quiescent.The red colors appear to be genuine and not driven by blends, as confirmed by visually inspecting the imaging for each galaxy.Their broad-band photometry lacks the strong spectral features that contribute to a secure photo-z: there is no detectable flux contamination by nebular emission lines and both the Lyman and Balmer breaks are weak.However, they are surprisingly well fit by LePhare (typical χ 2 N ≈ 1.5).Their likelihood redshift distributions L(z) are narrow as > 68% of the probability is typically contained within ∆ z ≈ 1, with similarly well constrained L(M | z).Re-cently, Lower et al. (2020) has shown how the relatively simple parametric star formation histories assumed by most templatebased SED fitting codes are susceptible to biases on the order of 0.5 dex in mass, which suggests that these uncertainties are likely underestimated (see also Michałowski et al. 2012Michałowski et al. , 2014)).
One possible explanation for smooth SEDs with a powerlaw slope is contamination by AGN.The COSMOS2020 results computed by LePhare include classifications for sources with strong X-ray detections12 determined from crossmatching to Civano et al. (2016) sources within 0.6 ′′ radius.In our sample we only use those sources identified as galaxies, which excludes these X-ray detections.Their inclusion would only serve to increase these already surprisingly large number densities.We do not attempt to quantify this, as the expected accretion disc light from Type 1 Seyferts and quasars make estimates of photoz and M unreliable and susceptible to catastrophic failures.See Weaver et al. (2022) and Salvato et al. (2011) for details.
Identification of AGN (including X-ray faint AGN) from broadband colors has been explored in the literature (Stern et al. 2005;Donley et al. 2012;Hviding et al. 2022), and their discussion is a standard component of SMF studies at these redshifts (see Grazian et al. 2015;Davidzon et al. 2017), although a consensus has yet to be reached.In general, AGN selection criteria rely on colors derived from (near-)infrared wavelengths from most notably Spitzer/IRAC.While the Donley et al. criteria have been used successfully at z < 3, they require constraints in all four IRAC bands, which is not the case for COSMOS2020 where channel 3 and channel 4 lack sufficient depth to detect these sources.Even if deeper IRAC images could be taken, the selection criteria rely on the four bands sampling the continuum shape at restframe 2 − 10 µm but at z > 3 instead sample the restframe stellar bulk at ≲ 2 µm.While the MIPS 24 µm data from S-COSMOS (Sanders et al. 2007) is an attractive solution, it also is too shallow (20.2 AB at 3σ, Jin et al. 2018) to fully constrain the restframe 2 − 10 µm continuum at 3 < z ≤ 5: only galaxies H ≲ 20 with large AGN fractions can be positively identified, and both low-fraction AGN at H ≈ 20 and normal galaxies at H < 20 cannot be classified with MIPS.Full spectral fitting including far-infrared measurements is challenging without the constraints from channel 3, channel 4, and 24 µm, and attempts to gain further insight using Stardust (Kokorev et al. 2021) were broadly unsuccessful with tentative contamination found to be on the order of 10 − 30%.We note, importantly, that removing potentially contaminated sources provides no significant change to the SMF at these epochs.
Despite these challenges, we are able to leverage the elementary AGN template fitting from LePhare to statistically assess the spectral similarity between the best-fit AGN and galaxy templates for each source, limited to rest-frame UV/optical light.While only 10% of galaxies in the total sample are best-fit with AGN templates, this fraction grows to 30% for all mass complete galaxies 3.5 < z < 5.5 and then to 50% for those with M > 10 10.8 M ⊙ with broad wings, having > 80% of these sources statistically indistinguishable as either galaxy or AGN (|∆χ 2 | < 0.5).Turning to further infrared data of individual sources, we find 15% of the sample are detected by VLA-COSMOS (Smolčić et al. 2017).While not conclusive, we find 5% of these massive M > 10 11 M ⊙ galaxies at 3 < z ≤ 5 are detected in the ALMA maps of A 3 COSMOS (Liu et al. 2019), which currently covers ∼5% of the COSMOS field, highlighting the need for further observations.While similar sample statistics for individual sources have been extrapolated in the literature (e.g., Santini et al. 2021), anticipated surveys such as the TolTEC Ultra-Deep Galaxy Survey (Pope et al. 2019) will expand and deepen the far-infrared/(sub)mm coverage in COSMOS so that these samples can be more conclusively investigated.
A recent X-ray and radio stacking analysis of similarly selected z > 1.5, M > 10 10 M ⊙ COSMOS2020 galaxies by Ito et al. (2022) (selected from independent SED fitting with MIZUKI, Tanaka 2015) revealed that low-luminosity AGN are likely ubiquitous in massive quiescent galaxies from 1.5 < z ≤ 5, even if individually they are not X-ray or radio detected.They also estimate the AGN contribution to optical/NIR continuum and find that at these redshifts, e.g., the rest-frame B-band luminosities of their quiescent galaxies are a factor of 30× larger than the expected rest-frame AGN luminosity.Ito et al. also find that the AGN luminosities for quiescent galaxies are significantly larger than those of star-forming galaxies.Together, their findings provide further confidence that our redshifts and stellar mass estimates for our X-ray undetected sample are not strongly contaminated by AGN light.
The lack of obvious systematics or likely only weak AGN contamination increases our confidence that these sources, or at least a part of them, are truly massive at z ≳ 3.In agreement with Ito et al. (2022), we also find that > 60% of M > 10 11 M ⊙ galaxies appear to be star-forming, and it is likely that at least some of them are also dust obscured (DSFGs; Casey et al. 2014;Zavala et al. 2021, and references therein).Since there are a number of sources found in the deeper NIR images of COSMOS2020, it makes sense that we are now more sensitive to fainter red sources than ever before (see Fig. 14).It is precisely this class of galaxy which are efficiently captured by infrared facilities such as Spitzer, ALMA, and JWST until now being optically "dark" (e.g., Schreiber et al. 2018c;Wang et al. 2019;Gruppioni et al. 2020;Sun et al. 2021;Fudamoto et al. 2021;Shu et al. 2022).If genuine, their existence not only points at an incompleteness particularly in the massive end of SMFs reported in previous studies lacking the necessarily deep infrared data over degree scales, but also highlights the sensitive interplay between the shape of their SEDs and the selection function consisting of the bands, their depths, and the detection methodology (see Fig. 3 of Weaver et al. 2022).Although detailed simulations will be explored in future work, qualitatively this could explain why the excess of sources relative to a Schechter at z ≈ 3 − 5 diminishes at z ≈ 6 as similarly red sources become too faint to still be detected.Optically dark galaxies selected from 2 mm ALMA observations in COSMOS from Casey et al. (2021) and Manning et al. (2022) are constrained to similar redshifts, stellar masses, and number densities.Importantly, Manning et al. studied two systems with star formation rates above 200 M ⊙ yr −1 but a gas depletion timescale < 1 Gyr, suggesting rapid star formation cessation and a potential transition to massive quiescent galaxies by z ∼ 3 − 4. Casey et al. (2021) find that DSFGs are responsible for ∼ 30% of the integrated star formation rate density at 3 < z < 6 and that 2 mm selection is an efficient way to identify larger samples in future surveys (see also Cooper et al. 2022).In an empirically based numerical model, Long et al. (2022) uses known DSFG population properties to predict the DSFG stellar mass function out to z ∼ 5.They find a shallow power law extension beyond the knee of traditional star-forming Schechter SMFs, and posit that 60-80% of massive (M > 10 11 M ⊙ ) star-forming galaxies at z ≈ 3 − 6 are significantly dust-obscured and not captured in previous SMF studies (see also Martis et al. 2016).This is in line with our aforementioned results on significantly attenuated NU-VrJ-selected massive star-forming galaxies at similar redshifts.
As introduced in Section 6.2 and shown in Fig. 13, there is generally good agreement between number densities of massive galaxies found in several simulations and in this work.This may be surprising given the range of physical recipes utilized in these simulations, and may suggest that several different tuning of physical recipes (mainly those of radiative and mechanical AGN feedback) can reproduce observations.However, from 2.5 < z ≤ 5.5, we observe galaxies with mass 1.8× larger than the most massive galaxies in any of these simulations.While bias and underestimated mass uncertainties may contribute, the simulations are usually considered to be limited by their volume.Fig. 15 compares the observed number densities of massive galaxies (including quiescent) to the upper limits set by volumes of different surveys as well as simulations.The fully hydrodynamical codes (Eagle, TNG100, and Simba) have the smallest volumes comparable to CANDELS and are similar to the observed number densities of M > 10 11 M ⊙ galaxies.However, the fact that they contain a large enough volume to catch the most massive galaxy seen in our observations may suggest that their volumes are adequate, but that their DM halo growth physics are not providing the large halos from which they can grow.Alternatively, recent observations of a highly star-forming galaxy at z = 6.9 (SFR≈ 2900 M ⊙ yr −1 , Marrone et al. 2018) place close to a maximally massive DM halo not seen in current simulations -populations expected to evolve into the most massive systems at z ≈ 2 − 4 (Lower et al. 2022).This is especially true of Shark, whose volume is similar to that of COSMOS itself.More massive galaxies, if they exist, are not likely to be found in COS-MOS, and so larger volumes (such as that probed by the two Euclid Deep Fields North and Fornax, as well as Roman) will be required, see McPartland et al. in preparation).
Naturally, massive galaxies in the z > 3 Universe are of great interest as targets for JWST.The widest deep-field ERS program is the 100 arcmin 2 Cosmic Evolution Early Release Science Survey (CEERS; Finkelstein et al. 2017), and based on our estimates it stands to find approximately 22, 16, 5 for M > 10 10.5 M ⊙ , and 6, 4, 2 for > 10 11.0 M ⊙ ) galaxies at 3 < z ≤ 3.5, 3.5 < z ≤ 4.5, and 4.5 < z ≤ 5.5, respectively, although the small area will equate to a stronger density bias from cosmic variance.The widest galaxy survey field of any GO program is the 0.6 deg 2 COSMOS-Web (Kartaltepe et al. 2021), which is expected to find 493 (137), 340, (94), 103 (40) galaxies for the same redshift ranges (and masses).Although JWST will be instrumental in studying these sources in exquisite spatial resolution and with efficient spectroscopy (Barrufet et al. 2021;Carnall et al. 2021;Glazebrook et al. 2021), ground-based NIR observations that can efficiently identify them in wide-area surveys will retain their importance.

Rise of quiescent galaxies
As shown by Fig. 7, low-z quiescent galaxies are well described by a two component Schechter function whose low-M component diminishes rapidly from z ≈ 0.2 → 1 (see Fig. 11).Simultaneously our sample becomes less mass complete with redshift, doing so more rapidly for quiescent systems due to their red color characteristic of their high mass-to-light ratios.Despite the considerable uncertainties on the completeness of our low-M quiescent sample outlined in Section 3.3, it seems likely that the shape of the quiescent stellar mass function in this work and in previous literature in fact does turn down at low-masses, with selection effects playing a comparably minor role (see also Ilbert et al. 2010).However, there are hints of M ≲ 10 9 M ⊙ quiescent systems at z ≳ 1.5 (beyond our mass limit) reported by Santini Given their low apparent brightness and rarity, the present work is the first to consistently quantify the number densities of M ≈ 10 9.5 M ⊙ low-mass quiescent systems at 1.5 < z ≤ 4.5 at high signal-to-noise, based on the deepest NIR data taken over such a homogeneously measured degree-scale area required to find them in sufficient numbers.As seen in Fig. 8, the rate of growth in the number density of low-mass (M ≲ 10 10 M ⊙ ) quiescent galaxies has been seemingly rapid over the past ∼ 10 billion years.Still, they constitute only a minor fraction of lowmass sources by z ∼ 0.2, and by extrapolating the number densities from z ≈ 1 − 2 one may expect to find none within COS-MOS by z ∼ 3 (see Fig. 8).This is typically interpreted by the phenomenological model of Peng et al. (2010) to mean that the processes which act to halt star formation cessation in low-M systems are inefficient at early times.For example the lack of virialized at z > 2 − 3 structure makes influence from environmental effects unlikely.
As shown by Fig. 8, the apparent lack of growth in the abundance of massive systems at z < 1 is the result of a decline in star-forming galaxies simultaneous with an increase in quiescent number densities.As noted by Ilbert et al. (2013), this decrease in the star-forming population is consistent with star formation cessation becoming extremely efficient, to the extent that massive star-forming galaxies are becoming quiescent faster than they can be replaced.Therefore, the mass assembly of massive star-forming systems at z < 1 is slower than the cessation of their star formation.However, there is also a slowing down in the rate of growth in the number density of massive quiescent sys-tems themselves.While this may suggest high incidences of dry mergers or even rejuvination, it must also relate to evolving demographics: from z ≈ 1.5 → 0.3 there are fewer and fewer high-M star-forming galaxies available to become quiescent.While the number density growth of massive systems seems to have stalled out, that of lower-mass systems continue to grow; this is the so-called phenomenon of "downsizing with time" (Cowie et al. 1996;Neistein et al. 2006;Fontanot et al. 2009).
The quiescent mass function at M > 10 11 M ⊙ does not change much from z ≈ 4.5 → 2.0, with surprisingly elevated quiescent fractions (Fig. 9) being above 20% at z < 5. Fig. 8 shows why: while the number density of star-forming galaxies at z ≈ 5 is lower than at z ≈ 2, quiescent galaxies are roughly constant in density over this same range.However, we caution that their number densities are only marginally above what should be possible to find within the nominal volume of COSMOS at these redshifts, and so it is plausible that they are dominated by noise and/or photo-z bias.The question of whether or not this behavior is genuine can only be explored in future large volume surveys, as demonstrated by Fig. 15.Even though the most massive galaxies at z ≳ 3 − 4 are typically too faint to obtain continuum features, spectroscopic follow-up and supporting SED fitting will continue to provide valuable insights (Gobat et al. 2012;Schreiber et al. 2018a;Valentino et al. 2020;Glazebrook et al. 2017).
Fig. 16 shows the fraction of quiescent galaxies in bins of mass for three epochs: z ≈ 0.3, 1.3, and 2.7.A key advantage of examining fractions is that they are less sensitive to the overall normalization of the simulation and biases from observations (e.g., in simulations: overproduction of all galaxy masses; in observations: systematics in effective survey volume), and provide additional insight which is obscured by simply comparing mass functions.Although comparisons to Universe Machine have been already discussed in Section 6.2, we introduce two samples of quiescent galaxies which we selected from Eagle and Shark with an NUVrJ selection consistent with our methodology.While Eagle underproduces quiescent systems by 10 − 20% at all masses, Shark overproduces them in all but the most massive bins at z ≈ 0.3 − 1.3 but underproduces them at z ≈ 2.7.Roughly, the fraction of quiescent galaxies in Shark at z ≈ 1.3 matches the observed fractions at z ≈ 0.3.This may suggest that Shark's physical recipes that halt star formation in lower mass galaxies are too aggressive.These same systems are also seemingly overproduced (see Fig. 13), and so may be assembling and maturing too early.For additional figures and details, see Appendix B.

Dark matter halo connection
The mass assembly of a galaxy is inherently connected to the dark matter halo in which it formed and grew (see Wechsler & Tinker 2018 for a review).Yet, stellar masses M have been observed to be < 2% of their halo mass M h , which point to galaxy formation as a strikingly inefficient process (Mandelbaum et al. 2006;Conroy & Wechsler 2009;Behroozi et al. 2010).This has led to investigations into the role of dark matter halo mass in influencing star formation cessation, including promoting thermal heating and/or gas expulsion by AGN (Shankar et al. 2006;Main et al. 2017), as well as virial shock heating of in-falling cold gas whose Jeans mass inhibits the formation of star-forming molecular clouds (so-called hot-halo mode, Birnboim & Dekel 2003).Hence, the gravitational influence of the halo mass on the cold gas reservoir regulates the ability of a galaxy to form stars, and hence the stellar-to-halo mass relation-  Lilly et al. 2013).While in this work we restrict ourselves to comparisons with theoretical HMFs, another paper in this series computes a self consistent SHMR based on measuring the halo occupation distribution directly from angular correlation functions and SMFs of COS-MOS2020 galaxies (Shuntov et al. 2022).For an investigation into the SHMR split by star-forming and quiescent samples, see Cowley et al. (2019), which is also based on galaxies from the COSMOS field.
As shown in Fig. 17, we compare our observed and inferred SMFs to the halo mass function (HMF) of Tinker et al. (2008) 13 from z = 1.5 → 7.5.We choose not to show z < 1.5 as these  (Behroozi et al. 2013) to produce an idealized SMF (gray curve).Upper limits on the SMF at each redshift are derived from the HMF assuming a fixed baryon fraction (0.166).Upper limits for empty bins are shown by the horizontal gray line with an arrow.Mass incomplete measurements are not shown.
comparisons have been thoroughly explored by previous investigations (e.g., Davidzon et al. 2017;Legrand et al. 2019) and we do not observe any significant differences.The effects of feedback can be seen in the first panel of Fig. 17 at 1.5 < z ≤ 2.0 that explains the relatively lower number densities of both low-and high-M systems, with those around M * being most similar to HMF.This apparent tension is a well known feature and lies at the foundation of the contemporary galaxy evolution paradigm, involving halting star formation by secular (internal) and/or environmental (external) action on the gas reservoir such as thermal heating, dynamical turbulence, and/or removal.The growth of low-mass M < M * galaxies can be impeded by secular (e.g., supernovae and stellar winds) and environmental processes (e.g., ram-pressure stripping, thermal evaporation).Similarly, secular and environmental processes can also impede the growth of massive M > M * galaxies, albeit from different driving forces such as AGN and major mergers, respectively (see Peng et al. 2010Peng et al. , 2012Peng et al. , 2015;;Wechsler & Tinker 2018;Förster Schreiber & Wuyts 2020).In this context, the characteristic knee at M * is the result of a build-up of massive galaxies which can no longer sustain mass growth.At z ≈ 3, the low-mass end is still considerably lower than the scaled HMF but the number density of massive systems comes into agreement.Although the stellar mass function slightly lies above the SHMR-scaled HMF at some points, we caution that this should not be taken as a challenge to theory as it assumes that the SHMR at z = 0 is appropriate at higher-z (which is unlikely; see Legrand et al. 2019) and small modifications can reconcile this difference.We note that Fig. 18 of Davidzon et al. finds no such offset using the same scaling, but we are unable to reproduce their precise result.
We derive an upper limit for the baryonic matter distribution by rescaling the HMF by Ω b /Ω m = 0.166, which for our adopted cosmology is redshift independent, and assume a 100% efficient baryon-to-stellar mass conversion.This is the maximum SMF physically allowed under our simple assumptions.This upper limit becomes relevant especially at z > 3.5 where our observed number densities exceed those inferred by the Schechter model.While a large Eddington bias or selection systematic could explain this excess (see Section 6.3), we stress that our inferred SMF assumes the applicability of Schechter formalism, which cannot accurately describe the observed number densities at 3.5 < z ≤ 5.5.Nevertheless, the inferred stellar mass function agrees well with the SHMR-scaled HMF up to z ≈ 7.This suggests that the most efficient haloes during these early epochs are not around M * , but rather the most massive ones, and with little observed evolution consistent with the findings from Stefanon et al. (2021).This phenomenon has also been observed in some local massive spiral galaxies (e.g., Di Teodoro et al. 2023), which have seemingly matured without significant star formation cessation.One interpretation is that this high star formation efficiency in massive haloes is the result of diminished feedback from AGN, with stellar mass growing similarly to the host halo at early times.Indeed, this is consistent with findings of inefficient radiative AGN feedback from simulations (Kaviraj et al. 2017;Laigle et al. 2019;Roos et al. 2015;Bieri et al. 2017;Habouzit et al. 2022), as well as FIR/radio observations of AGN activity at z > 3 (Maiolino et al. 2012;Cicone et al. 2014Cicone et al. , 2015;;Padovani et al. 2015;Vito et al. 2018).
At no point do our mass functions, observed or inferred, exceed this upper limit.Therefore we do not report evidence of "impossibly early galaxies" introduced by Steinhardt et al. (2016) who point out that there appears to be too many massive galaxies at z > 4 compared to the dark matter haloes that should host them.However at z > 6.5, where Steinhardt et al. predicts that the effect will be most obvious, we report observed number densities approaching this upper limit and in clear excess of the SHMR-scaled HMF.We caution that these sources are the most vulnerable to misclassification and bias, being susceptible to blending in addition to being constrained by only a handful of NIR bands yielding proportionately uncertain Schechter fits.Extrapolation to M ≳ 10 11.5 M ⊙ would place their number density below that which can be probed in a volume contained by the 0.716 deg 2 area of the Ultra-Deep region, and so COSMOS2020 is unlikely to find them if they exist.While they are not "impossibly early galaxies", their surprising abundance hint that explorations of z > 7 with future deep, large-volume surveys may provide the evidence necessary to firmly challenge theoretical frameworks.

Summary & conclusions
Following on from COSMOS2020 photo-z catalog (Weaver et al. 2022), we study the shape and evolution of the galaxy stellar mass function.Our measurements span three decades in mass (at z ≈ 0.2) across 10 billion years of cosmic history (z = 7.5 → 0.2), including the most mass complete sample of quiescent galaxies at z > 2 enabled by our unprecedented, homogeneous NIR depths across an effective 1.27 deg 2 .We probe a volume nearly 2× that of Davidzon et al. (2017) which not only improves sample statistics but also finds new galaxies of still greater mass at all redshifts.Complementary deep IRAC coverage has allowed us to directly measure the stellar bulk, and hence galaxy stellar mass, in a less biased way and to higher redshifts compared to K s -based measurements.We developed a robust, mass-dependent error budget with contributions from poisson, stellar mass, and cosmic variance, and account for the effects of Eddington bias by fitting a kernel-convolved Schechter function to our observed SMF.We use three fitting techniques, including the continuity model of Leja et al. (2019a), finding good agreement with literature measurements with smaller binto-bin variance with z.We stress that our derived parameters are dependent on the assumed Eddington correction, and while the inferred SMF evaluated at maximum likelihood and associated parameters are robust, parameters (and their uncertainties) derived from the median of posterior distributions can be unreliable if constraints are weak.To make these results more transparent and to facilitate comparisons, we have made the object IDs and key measurements presented in this work available to download 14 .
Although literature comparisons on the shape of the SMF at fixed redshift show good agreement, the novel advantage of COSMOS2020 is the extended historical baseline over which the mass functions (as well as many other properties) can be consistently measured.Not only do we examine the evolution of the integrated mass density ρ * over this time, we also examine the remarkably consistent rate of growth in the number densities of systems of different masses from z ≈ 7 → 1, whereupon the most massive star-forming galaxies become quiescent faster than they can be replaced.Similarly, we find evidence for the sharp rise in low-mass quiescent systems consistent with the phenomenological model of Peng et al. (2010) probed to z ≈ 2.5 where our sample becomes incomplete.Although tentative, we also find evidence for a sustained > 20% fraction of high-mass quiescent systems from z ≈ 5 → 2.
14 https://doi.org/10.5281/zenodo.7808832Furthermore, we highlight three main results: -Comparisons with several hydrodynamical simulations and semi-analytical models indicate an exceptional degree of agreement for the most massive galaxies out to high-z.This comes despite the surprisingly high number densities of massive galaxies at z ≈ 3 − 5 in excess of a Schechter function, suggesting that existing physical recipes are assembling massive M ≈ 10 10−11.5 M ⊙ systems in sufficient quantity at early times.In order to explore star formation cessation and feedback modes, we identify quiescent galaxies out to z ≈ 5.5 by means of a NUVrJ selection and compare them to consistently selected quiescent samples produced by two simulation codes, finding evidence for delayed assembly of lowmass quiescent systems in Eagle, and too rapid assembly in the Shark.-A closer examination of these massive systems reveals that a quarter are not found in COSMOS2015.Not only are they K s -faint, but their extremely red colors challenge SED fitting templates.We find no strong evidence for AGN contamination, although we stress the need for future infrared facilities with deep surveys capable of measuring the rest-frame MIR light at z ≈ 3 − 5. Recent findings of optically dark galaxies from IRAC and ALMA suggest that previous studies have missed contributions from dust-obscured star-forming galaxies.Their brightness, redshifts, mass, and number densities are consistent with our findings, suggesting that the K s ∼ 26 depth of UltraVISTA DR4 may indeed be sufficient to reach out into the tail end of this population missed by previous optical-NIR selections.Further work is required to conclusively establish the nature of these massive galaxy candidates.-Lastly, we investigate the connection to dark matter halos by comparing both our observed and inferred SMFs to constraints provided by the HMF.While we confirm the divegence of the SMF from the HMF at both low-and highmass regimes which has been historically intepreted as evidence for feedback processes, the massive end of the inferred, Schechter-fit SMF comes into agreement with the HMF at z ≳ 2. While we find no evidence of tension which would challenge theoretical models, our observed number densities at z ≈ 3 − 5 approach the upper limit for fully efficient star formation in the most massive halos.Larger volume surveys containing even more massive systems, if they exist, may be able to challenge these models, especially at z ≳ 6 − 7.
The launch of JWST has opened the door on a new era, and it will soon be flanked by efficient survey facilities from space (Euclid, Roman) and the ground (Rubin).While massive quiescent systems may exist at z ∼ 5 and perhaps even at earlier times (Mawatari et al. 2016(Mawatari et al. , 2020)), their identification is beyond the reach of COSMOS2020.They may be identified soon by deep degree-scale JWST surveys i.e., COSMOS-Web (Kartaltepe et al. 2021;Casey et al. 2022) and possibly by narrower ones including the JWST Advanced Deep Extragalactic Survey (JADES, Eisenstein et al. 2017;Ferruit 2017), the Cosmic Evolution Early Release Science Survey (CEERS, Finkelstein et al. 2017), the Next Generation Deep Extragalactic Exploratory Public Survey (NGDEEP, Finkelstein et al. 2021), and Ultra-deep NIRCam and NIRSpec Observations Before the Epoch of Reionization (UNCOVER, Labbé et al. 2021;Bezanson et al. 2022) to name a few.
While low mass quiescent systems at z ≳ 2.5 may be found by JWST (with hints emerging from Marchesini et al. 2023), a truly precise quantification of their number density and demographics is likely to remain a future objective.By extrapolating these observations, < 1% of M ≈ 10 9.5−10.0M ⊙ are expected to be quiescent by z ∼ 2.5, become even rarer at earlier times.While identifying even one quiescent, low-mass system at z > 2 in the absence of virialized structure would present a significant challenge to the paradigm of Peng et al. (2010), performing a statistically meaningful survey of them will require incredibly deep, degree-scale NIR surveys, placing them out of reach by current facilities.While the deepest degree-scale surveys from JWST (COSMOS-Web), Roman, and Euclid stand to establish the rarity of these systems at z ≈ 2 − 3, it seems that no currently planned survey will be able to definitively quantify their contribution at z ≳ 3.
While explorations of mass-selected samples at z > 7 are being be made possible by JWST, the most UV-luminous sources at these epochs will likely be missed by small area programs and yet are crucial to a comprehensive study of the ionizing UV budget (Kauffmann et al. 2022;Donnan et al. 2023).Following up known samples with JWST will not only allow us to measure the star formation rate and dust content of the first ultraluminous galaxies from deep within the epoch of reionization, but also to directly identify the progenitors of z ∼ 3 − 4 massive galaxies while still in their formation stage (e.g., Weaver et al. 2021).Yet despite the incredible promises of highly resolved IR imaging and spectroscopy from JWST, identifying the rarest and potentially most informative populations that can challenge and thereby improve galaxy formation models will remain the domain of wide-field surveys.Log 10 ( / ) 3.5 < z 4.5  Log 10 ( / ) Log 10 ( / ) .Fits for the median posterior will be similar for symmetric parameter posterior distributions, which are found in all but the last bin shown here.Kernel convolved fits are shown by the dashed curves.
Fits measured using the continuity model are shown in gray solid curves.Lower panels indicate the fraction of quiescent galaxies F QG ≡ Φ QG / (Φ SF + Φ QG ) as a function of log 10 (M/M ⊙ ) for the data (colored points) and maximum likelihood fits (solid curve).
Since the star-forming and quiescent fits are not required to sum to the total fit, changing the definition to F QG ≡ Φ QG / Φ Total as reported by Ilbert et al. (2013) and Davidzon et al. (2017) produces the dotted curves with noticeably lower F QG at high M.

Fig. 1 :
Fig.1: Galaxies are classified as either star-forming or quiescent in bins of redshift by selection in rest frame NUV − r and r − J colors, followingIlbert et al. (2013).The rest-frame J band is not directly measured at z ≳ 3.5 and so the sloped part of the selection box is dashed to indicate that the r − J colors are strongly model dependent.Typical uncertainties for the rest-frame colors of the star-forming and quiescent subsamples are estimated as medians of the nearest observed-frame bands, consistent with the derivation of the rest-frame colors.See the text for details.

Fig. 4 :
Fig. 4: Likelihood distributions of galaxy stellar mass are derived from SED fitting with LePhare and assume a fixed redshift.Individual distributions (gray) are summarized by a median stack (blue) grouped by redshift and stellar mass (indicated in ranges of log 10 (M/M ⊙ )).Estimates of standard deviation σ are shown.The size of a typical mass bin used in this work is 0.25 dex, indicated by the pair of dotted gray lines in each panel.

Fig. 5 :
Fig.5: Evolution of the galaxy stellar mass function over 12 redshift bins (0.2 < z ≤ 7.5) for the total, star-forming, and quiescent samples.Mass incomplete bins based on the channel 1 limiting magnitude are not shown.For the quiescent sample, bins that are fully incomplete based on the K s limiting magnitude are indicated by cross hatching.
2 by comparing with two SMFs previously measured in the same field: Ilbert et al. (2013) and Davidzon et al. (2017); both use photo-z and M computed with LePhare over COSMOS, out to z ≈ 4.0 and z ≈ 5.5, respectively, with the nearly same redshift binning scheme as we use.We note one exception that Ilbert et al. bins sources at 3.0 < z < 4.0 whereas Davidzon et al. uses 3.0 < z ≤ 3.5 and 3.5 < z ≤ 4.5, and so we opt to follow the scheme of Davidzon et al. as it allows a comparison up to higher redshift and omit the comparison with the highest-z measurement of Ilbert et al.. Additionally, Davidzon et al.only considered sources in the ultra-deep regions of COSMOS2015 (corresponding to UltraVISTA (DR2) as the spatial inhomogeniety of the NIR bands implies a significant variation in selection function and mass completeness between the deep and ultra-deep regions.Thankfully, COSMOS2020 (corresponding to UltraV-ISTA DR4) contains nearly uniform NIR coverage (∆ ≈ 0.4 mag,

Fig. 6 :
Fig. 6: Evolution of the galaxy stellar mass function across 12 redshift bins (0.2 < z ≤ 7.5).The 0.2 < z ≤ 0.5 SMF from the first redshift bin is repeated in each panel for reference shown by the purple dotted curve.Two other COSMOS stellar mass functions from Ilbert et al. (2013) and Davidzon et al. (2017) are shown for comparison, along with Grazian et al. (2015) from UDS/GOODS-S/HUDF and the recent work of Stefanon et al. (2021) from GREATS at z > 6. Mass incomplete measurements are not shown.Upper limits for empty bins are shown by the horizontal gray line with an arrow.

Fig. 7 :
Fig.7: Evolution of the star-forming (blue) and quiescent (orange) galaxy components of the galaxy stellar mass function in nine bins of redshift (0.2 < z < 4.5).Uncertainty envelopes correspond to 1 and 2σ limits.For comparison, we show other literature studies in COSMOS fromIlbert et al. (2013) andDavidzon et al. (2017) that adopt similar selections and methodologies.For reference, the total SMF is shown in each bin (solid gray) and the 0.2 < z < 0.5 total SMF is repeated in each panel (purple dotted).Mass incomplete measurements as defined by the channel 1 limiting magnitude are not shown, with the mass limits corresponding to the K s limiting magnitude shown by orange arrows.Upper limits for empty bins are shown by the horizontal gray line and arrow.

Fig. 8 :
Photo-z 2 is smaller than that found byDavidzon et al. (0.35), τ c ≈ 0.1 is conspicuously larger (0.04, same as inIlbert et al. 2013).If we instead fit to the L(M | z) distributions directly, we find σ Edd ≈ 0.05 and τ c ≈ 0.04.These are much smaller than Davidzon et al., which may be explained in part by the slightly different approach they used to estimate L(M) that more directly incorporates uncertainties on photo-z (see Section 4.1 ofDavidzon et al.).Nevertheless, we opt for the pessimistic case and fix the kernel shape and evolution with z to σ Edd ≈ 0.2 and τ c ≈ 0.1 for the subsequent analysis of the total, as well as the star-forming and quiescent SMFs; their shapes are shown in the inset panels inFigs.C.1,  C.2, and C.3, respectively. Photo-z Fig. C.4 and discussed in Appendix C. The evolution of the Schechter parameters inferred from the total SMF are shown in the leftmost panels Fig. 11, and compared with Davidzon et al. in the center panels.Table C.1 records the best-fit parameters, with detailed fits shown in Fig. C.1.

Fig. 10 :
Fig. 10: Summary of inferred galaxy stellar mass function evolution.Best-fit parameters are estimated by a Markov Chain Monte Carlo fitting with a Schechter function at fixed redshift convolved with a redshift dependent kernel from which the inferred intrinsic mass function is recovered with the unconvolved Schechter fit (colored solid curves) for the total (leftmost), star-forming (center left), and quiescent (center right) samples.Estimates are shown corresponding to both the median posterior parameter values including 1σ envelopes (upper row) and the single set of maximum likelihood parameters (bottom row).The rightmost panel shows the results of fitting the Double Schechter continuity model ofLeja et al. (2019a) to the total SMF between 0.2 < z ≤ 3.0 where the fits are reliable; see later in Section 5.3 for details.
. Tables C.2 and C.3 record the best-fit parameters, with fits shown in Figs.C.2 and C.3, respectively.Adapting the continuity model to the star-forming and quiescent SMFs is left to future work.

Fig. 11 :
Photo-z ) and McLeod et al. (2021), although generally in agreement, are also complicated by differences in selection.Whereas Santini et al. (2021) adopts a novel SFR-driven selection, McLeod et al. (

Fig. 13 :
Fig. 13: Comparison of observed and inferred galaxy stellar mass function (gray points, and colored curve with 1 and 2σ envelopes, respectively) to the reference flavors of four simulations: TNG100 of the IllustrisTNG project (Pillepich et al. 2018), Eagle (Furlong et al. 2015), Shark (Default; Lagos et al. 2018), Santa Cruz (Yung et al. 2019a,b), and Simba (Flagship; Davé et al. 2019).To note, we do not apply any artificial normalization to any model or observation.Upper limits for empty bins are shown by the horizontal gray line with an arrow.Mass incomplete measurements are not shown.

Fig. 14 :
Fig.14: K s magnitude and H − K s colors of 125 new massive M > 10 11 M ⊙ galaxy candidates 3.0 < z ≤ 5.5 found in COS-MOS2020 (orange), relative to the total sample used in this work (gray) and the 444 galaxies found in COSMOS2015 (blue).Measurements are taken from COSMOS2020, although they are similar to those from COSMOS2015, where matched.Number densities are summarised using gaussian kernel density estimators to produce smoothed distributions and contours for each sample.Median K s and H − K s values for each sample are shown by colored dotted lines.

Fig. 15 :
Fig.15:Comparison of observed number densities and upper limits on the rarity probed by various observed (gray steps) and simulated (colored lines) volumes.Number densities correspond to 10 11.0 < M < 10 11.5 M ⊙ and 10 11.5 M < 10 12.0 M ⊙ from the total sample (light and dark gray, respectively), and 10 11.5 M < 10 12.0 M ⊙ from the quiescent sample (orange) from Fig.8. 1 σ upper limits are computed followingGehrels (1986), which for the observed volumes are dependent on widths of redshift bins.

Fig. 17 :
Fig.17: Comparison of the inferred z > 1.5 galaxy stellar mass function (colored curve with 1 and 2 σ envelopes) fitted to observed measurements (gray points).We scale theTinker et al. (2008) halo mass function (HMF; lower bound of the gray shaded region) by 0.018 corresponding to the stellar-to-halo mass relation at z = 0 and M h = M * h(Behroozi et al. 2013) to produce an idealized SMF (gray curve).Upper limits on the SMF at each redshift are derived from the HMF assuming a fixed baryon fraction (0.166).Upper limits for empty bins are shown by the horizontal gray line with an arrow.Mass incomplete measurements are not shown.

Fig. B. 1 :
Fig. B.1: Galaxy stellar mass functions for total (gray), star-forming (blue), and quiescent (orange) samples from this work compared to simulations.We include those from Universe MachineBehroozi et al. (2019, sSFR< 10 −11 M ⊙ yr −1 ) shown by dotted lines with hatched uncertainty envelopes, as well as consistently NUVrJ-selected samples from Eagle(Furlong et al. 2015) and Shark(Lagos et al. 2018) shown by the dashed and dot-dashed lines, respectively.Upper limits for empty bins are shown by the horizontal gray line with an arrow.Mass incomplete measurements are not shown.

Fig. C. 4 :
Fig. C.4:Best fit Schechter models corresponding to maximum likelihood for the total, star-forming, and quiescent samples (solid colored curves) based on the observed data (colored points).Fits for the median posterior will be similar for symmetric parameter posterior distributions, which are found in all but the last bin shown here.Kernel convolved fits are shown by the dashed curves.Fits measured using the continuity model are shown in gray solid curves.Lower panels indicate the fraction of quiescent galaxies F QG ≡ Φ QG / (Φ SF + Φ QG ) as a function of log 10 (M/M ⊙ ) for the data (colored points) and maximum likelihood fits (solid curve).Since the star-forming and quiescent fits are not required to sum to the total fit, changing the definition to F QG ≡ Φ QG / Φ Total as reported byIlbert et al. (2013) andDavidzon et al. (2017) produces the dotted curves with noticeably lower F QG at high M.
Davidzon et al. predicts a lower volume density than either Ilbert et al. or our measurements, which lie in agreement.This is not surprising, as Davidzon et al. expanded the template library of LePhare to include both starbursting and dust-obscured systems (see Sect. 3 of Davidzon et al.) and so our work is naturally more similar to that of Ilbert et al.