Tracing the rise of supermassive black holes: A panchromatic search for faint, unobscured quasars at z ≳ 6 with COSMOS-Web and other surveys

We report the identification of 64 new candidates of compact galaxies, potentially hosting faint quasars with bolometric luminosities of L bol = 10 43 – 10 46 erg s − 1 , residing in the reionization epoch within the redshift range of 6 ≲ z ≲ 8. These candidates were selected by harnessing the rich multiband datasets provided by the emerging JWST-driven extragalactic surveys, focusing on COSMOS-Web, as well as JADES, UNCOVER, CEERS, and PRIMER. Our search strategy includes two stages: applying stringent photometric cuts to catalog-level data and detailed spectral energy distribution fitting. These techniques e ff ectively isolate the quasar candidates while mitigating contamination from low-redshift interlopers, such as brown dwarfs and nearby galaxies. The selected candidates indicate physical traits compatible with low-luminosity active galactic nuclei, likely hosting ≈ 10 5 –10 7 M ⊙ supermassive black holes (SMBHs) living in galaxies with stellar masses of ≈ 10 8 –10 10 M ⊙ . The SMBHs selected in this study, on average, exhibit elevated mass compared to their hosts, with the mass ratio distribution slightly higher than those of galaxies in the local universe. As with other high-z studies, this is at least in part due to the selection method for these quasars. An extensive Monte Carlo analysis provides compelling evidence that heavy black hole seeds from the direct collapse scenario appear to be the preferred pathway to mature this specific subset of SMBHs by z ≈ 7. Notably, most of the selected candidates might have emerged from seeds with masses of ∼ 10 5 M ⊙ , assuming a thin disk accretion with an average Eddington ratio of f Edd = 0 . 6 ± 0 . 3 and a radiative e ffi ciency of ϵ = 0 . 2 ± 0 . 1. This work underscores the significance of further spectroscopic observations, as the quasar candidates presented here o ff er exceptional opportunities to delve into the nature of the earliest galaxies and SMBHs formed during cosmic infancy.


Introduction
Powered by gas and dust accretion onto supermassive black holes (SMBHs), quasars are among the brightest entities in the Universe with the corresponding active galactic nucleus (AGN) bolometric luminosities reaching L bol ≳ 10 46 erg s −1 .Thanks to various wide-field sky surveys, to date, more than 200 quasars hosting ≳ 10 9 M ⊙ black holes have been discovered at z ≳ 6, with a select number of them already shining brightly when the cosmos was just less than 800 Myr old (see, e.g., Fan et al. 2023, for a recent review).Assuming that such SMBHs originate from less massive seeds (i.e., ≈ 10 2 -10 6 M ⊙ ), assembling those enormous amounts of mass is challenging, requiring highly efficient matter accretions with additions of black hole mergers (Woods et al. 2019;Pacucci & Loeb 2020).Hence, these high-z quasars, with their extreme characteristics compared to inactive galaxies, are ideal targets for examining the assembly of the earliest galaxies and SMBHs during cosmic infancy (Pacucci & Loeb 2022).
Several studies have proposed explanations for constructing the black hole seeds, although the comprehensive solution to this problem is still open-ended.These theories include the idea that the first generation of low-mass black holes are presumably produced at the same time when the first-generation stars (hereafter Population III stars) are populating the Universe at z ∼ 20-30, or around 200 Myr since the Big Bang (Volonteri et al. 2021).In line with that, black hole seeds are often separated into two classes, depending on their initial mass: (i) heavy seeds with a mass range of 10 4 -10 6 M ⊙ and (ii) light seeds with masses of 10-100 M ⊙ (see, e.g., Inayoshi et al. 2020, and references therein).
One challenge of growing light seeds to form 10 9 M ⊙ SMBHs by z ≈ 7 is there is simply not enough time unless episodes of super-or even hyper-Eddington accretion can be sustained (e.g., Middleton et al. 2013;Madau et al. 2014;Dubois et al. 2014;Valiante et al. 2016;Pezzulli et al. 2016;Pacucci et al. 2017;Natarajan 2021).While the super-Eddington accretion rate is just slightly above the Eddington-limit rate but still around the same order of magnitude, hyper-Eddington events can have values that are hundreds of times higher owing to photon trapping mechanisms reducing the radiation pressure effect on the infalling matter (Begelman & Volonteri 2017).However, since most of the quasars discovered today are observed as having instantaneous accretion rates below or around the Eddington limit (Trakhtenbrot et al. 2017;Fragione & Pacucci 2023), the theory on heavy seeds is thus being explored further to ease the time-limited SMBH growth issue and possibly jump-start the formation of high-z quasars (e.g., Yoo & Miralda-Escudé 2004;Volonteri 2010;Mayer & Bonoli 2019).As the first possibility, heavy seeds could form by collapsing primeval gas residing in the atomic-cooling halo, potentially producing short-lived supermassive stars (or quasi-stars) as by-products with a mass range of 10 5 -10 6 M ⊙ (Bromm & Loeb 2003;Lodato & Natarajan 2006;Hosokawa et al. 2013;Smith & Bromm 2019).The second possibility of heavy seed formation is that runaway collisions and mergers of either black holes or Population III stars within a gasdense environment -namely, a dense star cluster -could produce seeds with masses of 10 3 -10 4 M ⊙ (Alexander & Natarajan 2014;Lupi et al. 2016;Boekholt et al. 2018;Latif et al. 2021;Massonneau et al. 2023;Trinca et al. 2023).Heavy seeds might reduce the discrepancy between the theoretical model of SMBH growth and the observed quasar properties.However, such objects have yet to be detected (Nabizadeh et al. 2023;Natarajan et al. 2023).
Discovering more quasars in the reionization era is one obvious pathway for understanding early SMBH formation.In particular, finding less massive black holes (≈ 10 6 -10 8 M ⊙ ) at higher redshifts might give more information on whether heavy seeds are the dominant channel to explain the majority of the z ≳ 7 quasar population.Only the most luminous quasars, and hence, the largest, rarest SMBHs, could be discovered before the launch of the James Webb Space Telescope (JWST; Mortlock et al. 2011;Bañados et al. 2019;Matsuoka et al. 2019;Venemans et al. 2020;Wang et al. 2021;Yang et al. 2021;Izumi et al. 2021;Andika et al. 2022).Today, JWST is allowing for high-z lower-luminosity AGNs (L bol ≈ 10 43 -10 45 erg s −1 ) to be hunted where the stellar light might dominate the total emission or where the central emission from the accretion process is obscured (e.g., Labbe et al. 2023;Maiolino et al. 2023a,b;Larson et al. 2023;Fujimoto et al. 2023c;Goulding et al. 2023;Furtak et al. 2023;Kokorev et al. 2023;Greene et al. 2023;Williams et al. 2023a;Kokorev et al. 2024;Pérez-González et al. 2024).About 30 lower mass SMBHs have been reported so far, and these objects might be the missing connection between the earliest bright quasars and black hole seeds.
Given the necessity of understanding how the first SMBHs and galaxies evolve, we present 64 new compact sources, potentially harboring quasars with L bol ≲ 10 46 erg s −1 and ≲ 10 8 M ⊙ SMBHs at 6 ≲ z ≲ 8, selected utilizing various groundand space-based imaging data.Specifically, we exploit publicly available archival datasets covering the COSMOS, GOODS-S/N, Abell 2744, EGS HST legacy, and PRIMER extragalactic fields.If spectroscopically confirmed, our candidates will double the number of quasars in the mass, luminosity, and redshift ranges mentioned earlier.Furthermore, combining our samples with other published quasars in the literature will allow us to perform statistical analysis on this intriguing population and check their black hole and host galaxy characteristics.
The outline of this paper is as follows.We start with the details on data acquisition and main database construction in Section 2.Then, the method for identifying quasar candidates via photometric and spectral energy distribution (SED) modeling will be presented in Section 3.After that, we deliver the results and discuss the properties of the new candidates in Section 4. Finally, we end this paper with a summary and conclusions in Section 5.For simplification and ease of reference within this paper, we subsequently define "quasar" as an interchangeable term for quasi-stellar object (QSO) and active galactic nucleus (AGN).On several occasions, low-luminosity AGNs with L bol ≲ 10 46 erg s −1 , whose emission could be overwhelmed by the host galaxy's light but the AGN contribution is still detectable are also considered as quasars.The magnitudes written in this paper are reported using the AB system.We further adopt the flat ΛCDM cosmological framework, where we as-sume Ω Λ = 0.7, Ω m = 0.3, and H 0 = 70 km s −1 Mpc −1 .Consequently, at z = 7, the Universe's age is 0.748 Gyr, and the angular scale of θ = 1 ′′ corresponds to a linear scale of 5.3 kpc.

Multi-survey datasets
This section outlines the multiband photometric datasets used for the high-z quasar selection in several major JWST extragalactic fields: COSMOS, GOODS-S/N, Abell 2744, EGS HST legacy, and PRIMER.Some details on each of these surveys, data processing, and catalog construction will also be discussed here.The unified database is then utilized to perform preselection and SED modeling to find promising candidates.

The COSMOS-Web survey
The first dataset is based on the COSMOS-Web program (GO #1727, PIs Kartaltepe & Casey), a deep imaging program covering 0.54 deg 2 with 255 hours total integration time.COSMOS-Web uses four JWST/NIRCam bands (F115W, F150W, F277W, and F444W) and one MIRI filter (F770W) in parallel.More details on the survey description and observing strategy are presented by Casey et al. (2023).Our work utilizes the first two epochs of COSMOS-Web data obtained in January and April 2023.The current available NIRcam mosaics cover approximately 0.28 deg 2 ; on the other hand, MIRI data contains 0.07 deg 2 of the COSMOS-Web field.
Data reduction for the NIRCam images is carried out utilizing the standard JWST Calibration Pipeline (Bushouse et al. 2022).In addition to that, custom processing steps are implemented to improve the image quality.This includes 1/f noise and low-level background subtraction (e.g., Bagley et al. 2022) and astrometric correction bootstrapped from the Hubble Space Telescope (HST) imaging in the F814W filter (Koekemoer et al. 2007) and the COSMOS2020 catalogs (Weaver et al. 2022), anchored to the Gaia-EDR3 data (Gaia Collaboration et al. 2023).The resulting multiband image mosaics with 0 ′′ .03/pixel have an astrometric normalized median absolute deviation below 12 mas.Accordingly, MIRI data are reduced using a similar process to produce 0 ′′ .06/pixel mosaics.While we only give a short overview here, two forthcoming papers will discuss details of the reduction process (Franco et al.; Harish et al., in prep.).
We complement the JWST data with multiwavelength information from various surveys performed on the Cosmic Evolution Survey (COSMOS) field.This includes photometric datasets from HST/F814W (Scoville et al. 2007;Koekemoer et al. 2007), Spitzer/IRAC (Euclid Collaboration et al. 2022), Subaru/HSC PDR3 (Aihara et al. 2022), and UltraVISTA DR5 (McCracken et al. 2012).A detailed summary of how these data are compiled and reprocessed is provided by Weaver et al. (2022).Furthermore, we add submillimeter measurements from the A3COSMOS catalog (Liu et al. 2019) when available.
The COSMOS-Web photometric catalog is produced using the SourceXtractor++ code (SE++; Bertin et al. 2020Bertin et al. , 2022)).To create a detection image for reference, we first stack all four NIRCam bands via a chi-square (χ 2 ) combination (Szalay et al. 1999).Flux measurements are then performed on each band using model-based photometry, including the ancillary data from HST and other ground-based observations.We note that modelbased photometry enables flux extraction on images with diverse point spread functions (PSFs) without degrading their quality (see also Weaver et al. 2023b).Specifically, this approach allows us to include constraints from ground-based data without sac- rificing space-based data's resolution and, consequently, photometric accuracy.In total, 342,435 sources are obtained from this catalog.
It should be noted that the flux errors of faint or undetected targets are often underestimated due to the flexibility given to the SE++ catalog construction.To handle this issue, we set a noise floor in each band equivalent to the shot noise calculated using circular apertures placed randomly with sizes of 0 ′′ .3 and 1 ′′ for space-based and ground-based data, respectively.Furthermore, to compute the source detection's significance, parameterized with the signal-to-noise ratio (S/N), we also consider the fluxto-error ratio extracted using an aperture of 1 ′′ .5 diameter.This measurement is more robust than the model-based photometry S/N, and the aperture size is large enough to capture the whole source light, given the different PSF sizes between image filters.The details on the photometric catalog creation will be described in a separate work (Shuntov et al., in prep.).

The JADES project
Multiband data of the Great Observatories Origins Deep Survey South (GOODS-S) sky field is taken from the first public release of the JWST Advanced Deep Extragalactic Survey (JADES 1 ; Eisenstein et al. 2023) observations.This dataset covers the "deep" portion of the images with exposure time per filter of 3.9-16.7 hours obtained in September/October 2022, resulting in a sky area of 25 arcmin2 with a nominal 5σ depth of around 29.9 mag.The JWST/NIRCam filters utilized by the JADES project include F090W, F115W, F150W, F200W, F277W, F335W, F356W, F410M, F444W -that is, spanning the wavelengths of 0.8-5.0µm.Photometry for 47,181 unique targets is provided in the catalog, where the source extractions and measurements are explained in detail by Hainline et al. (2023).
The JADES catalog also makes use of the JWST Extragalactic Medium-band Survey (JEMS; Williams et al. 2023b) data, adding F182M, F210M, F430M, F460M, and F480W filters.Moreover, observations from the First Reionization Epoch Spectroscopic COmplete survey (FRESCO; Oesch et al. 2023) are also included, complementing the JADES catalog with F182M, F210M, and F444W filters when available.As for the bluer wavebands, JADES utilized the existing deep HST/ACS and WFC3 mosaics from the Cosmic Assembly Near-infrared Deep Extragalactic Legacy Survey (CANDELS; Grogin et al. 2011; 1 https://jades-survey.github.ioKoekemoer et al. 2011) and the Hubble Legacy Field dataset (Whitaker et al. 2019) containing F435W, F606W, F775W, F814W, and F850LP images.Finally, it is worth mentioning that all fluxes we use for the SED fitting later are based on the ones measured within a circular aperture with a radius of 0 ′′ .15, corrected for flux losses, in the "CIRC_CONV" table of the JADES catalog.

The UNCOVER program
The search on the lensing cluster Abel 2744 region will be conducted using the data provided by the Ultradeep NIRSpec and NIRCam ObserVations before the Epoch of Reionization (UN-COVER 2 ; Bezanson et al. 2022) Cycle 1 JWST Treasury program.The second version of the photometric catalog released by this program is constructed based on the 49 arcmin 2 image mosaics of seven NIRCam filters -that is, F115W, F150W, F200W, F277W, F356W, F410M, and F444W -together with numerous HST/ACS and WFC3 ancillary data.
For photometry purposes, all UNCOVER mosaics are PSFmatched to the F444W band, and the detection images for source extractions are created by combining the F277W, F356W, and F444W mosaics exploiting the noise-equalized technique.Specifically, the so-called "SUPER" catalog that we will use here, where the photometry is calculated from optimally selected color apertures in the range of 0 ′′ .32 -1 ′′ .4 diameter on 0.04 ′′ /pixel mosaics, reaches a nominal 5σ magnitude limit of around 30 mag.We note that the fluxes in that catalog are corrected to total values using the Kron Radius measured in the detection image, with an additional correction of approximately 5-10% applied to account for missing light beyond a 1 ′′ radius, guided by the F444W curve of growth (Weaver et al. 2023a).By default, fluxes for 61,648 unique sources in the UN-COVER catalog are reported in the unit of 10 nJy or correspond to the AB magnitude zero point of 28.9.This catalog is further enriched with submillimeter measurements from the Deep UNCOVER-ALMA Legacy High-Z (DUALZ) Survey, featuring ALMA band 6 observations with a 30-GHz wide frequency band down to a sensitivity of 32.7 µJy beam −1 (Fujimoto et al. 2023b).

Additional archival data
In addition to the previously mentioned datasets, we also make use of the JWST data targeting some public extragalactic fields, which were processed with grizli (Brammer et al. 2022) and msaexp (Brammer 2023) by the Cosmic Dawn Center (DAWN), stored in the DAWN JWST Archive (DJA3 ; Valentino et al. 2023).Specifically, we first mined the Cosmic Evolution Early Release Science Survey (CEERS4 ; Finkelstein et al. 2023) data provided by DJA to expand our candidates list.We refer the reader to Bagley et al. (2023) for a complete description of the official CEERS data products.In short, the dataset consists of NIRCam imaging in F115W, F150W, F200W, F277W, F356W, F410M, and F444W bands targeting the Extended Groth Strip (EGS) HST legacy field with the current area coverage of 91 arcmin 2 and a 5σ depth of 28.3-28.8mag.
Along with that, we also exploited the DJA's version of the Public Release IMaging for Extragalactic Research (PRIMER5 ; Dunlop et al. 2021) dataset.The PRIMER survey was performed utilizing the NIRCam and MIRI imaging on two contiguous equatorial regions, namely, the Ultra-Deep Survey (UDS; Lawrence et al. 2007) and COSMOS (Scoville et al. 2007) fields.While the used filters are similar to CEERS, PRIMER further enriches the covered wavelengths by adding F090W, F770W, and F1800W bands.In total, the areas covered by the PRIMER-UDS and PRIMER-COSMOS reach about 212 arcmin 2 and 164 arcmin 2 , respectively, with a 5σ limiting magnitude of 27.4-27.9mag.As further information, complementary to the CEERS and PRIMER's JWST data, DJA also provides photometric measurements based on the existing HST archival images (i.e., Grogin et al. 2011;Koekemoer et al. 2011;Kokorev et al. 2022).
At the time of writing, the photometric catalog of the GOODS-N field, along with some GOODS-S regions from the "non-deep" portion of the JADES programs, has yet to be released by the official JADES collaboration (Eisenstein et al. 2023).Fortunately, a subset of their NIRCam mosaics is publicly available and processed by DJA, covering the area of about 55 arcmin 2 and 57 arcmin 2 for the northern and southern datasets, respectively.These images include additional data from the observing programs of JADES Medium, 1210/1286 Parallel, and northwest and southeast pointings.Similar to the dataset introduced in Section 2.2, DJA complemented the JADES GOODS-S/N data with the NIR imaging from the JEMS (Williams et al. 2023b) and FRESCO (Oesch et al. 2023) projects, as well as the optical photometry from the Hubble Legacy Fields program (Whitaker et al. 2019).With all data in hand, we ultimately consider the aperture-based photometry, corrected for flux losses, calculated with a diameter of 0 ′′ .36 for CEERS, PRIMER-UDS, PRIMER-COSMOS, GOODS-N, GOODS-S catalogs produced by DJA, each containing 76,300, 143,552, 118,794, 37,890, and 52,427 objects, respectively.It is important to note that the GOODS-S dataset constructed here and the one obtained in Section 2.2 are then merged to remove duplicated sources by crossmatching these two catalogs using a 1 ′′ radius.This combined catalog is hereafter called "JADES" to differentiate them from the GOODS-N data.
At this point, we then compile the catalogs from the COSMOS-Web, JADES GOODS-S/N, UNCOVER, CEERS, and PRIMER projects.All fluxes are converted to nJy unit, corresponding to AB zero point of 31.4 mag.To further ensure that bright flux values do not excessively influence the SED fitting process later and to accommodate potential uncertainties in photometric calibration, we designate a lower limit of 5% as the error floor for the photometric measurements (e.g., Hainline et al. 2023).The effect of Galactic extinction is then corrected using the dust map of Schlegel et al. (1998) and reddening correction of Fitzpatrick (1999), applied using the software from Green (2018).
This resulting parent catalog comprises 851,518 unique sources (see Table 1 for the breakdown), which includes a mix of galaxies and quasars at all redshifts, stars, substellar objects, and artifacts.The following sections will describe various steps to extract the actual quasar content and resulting AGN properties, with the steps already listed in Table 1.First, the SEDs of all objects are modeled with composite SED templates representative of galaxies with and without AGN, as well as stars and substellar objects, including dust reddening.This SED modeling will robustly remove all non-galaxies from the catalog, low-redshift galaxies, and AGN with a photometric z < 5.5.The resulting much smaller sample of high-z candidates is then visually inspected to remove objects with SEDs impacted by cosmic ray hits, hot pixels, stray light residuals, etc.This approach will provide a high-probability set of high-z candidate galaxies and AGN we already discussed.Then, in the final step, detailed independent SED fitting is used to extract relative galaxy and AGN flux contributions in these high-probability candidates.We demonstrate the robustness and limits of these estimates and then use the resulting AGN flux to infer AGN properties for the sample.

Photometric redshift estimation and initial selection
We implement the first SED modeling step -to find the quasar candidates and separate them from other contaminants, such as low-z galaxies, brown dwarfs, detector artifacts, etc. (e.g., Andika et al. 2020Andika et al. , 2023a) -using eazy-py6 , a Python-based photometric redshift estimator (Brammer et al. 2008).By iterating through a user-defined grid of spectral templates and redshifts, eazy-py tries to find the best model that matches the observed photometry.
Here, the templates for quasar SEDs are derived empirically from the observational data of XMM-COSMOS AGNs and galaxies, provided and discussed in detail by Ananna et al. (2017).Although the original template list includes a wide variety of galaxy types, we exclusively use the spectra of bright quasars showing broad emission lines and a blue rest-frame ultraviolet (UV) continuum for our purposes (e.g., Andika et al. 2023b).As done by Duncan et al. (2021), we further append the effect of dust extinction using attenuation levels (A V ) ranging from 0 to 2 with a step of 0.2, following the model from Calzetti et al. (2000).
Also, the built-in templates for inactive galaxies provided by eazy-py are constructed based on the Flexible Stellar Population Synthesis code (FSPS; Conroy et al. 2009Conroy et al. , 2010;;Conroy & Gunn 2010) and one high-equivalent-width galaxy from Erb et al. (2010).These SEDs contain a mixture of stellar, nebular, and dust-reprocessed emission components.It is important to note that, since young, high-z galaxies could show very blue UV continuum slopes due to their high star formation rate (SFR), lower metallicity, and less dusty nature, we put to use additional bluer templates from Larson et al. (2022) complementing the available SED models.That is, we use the "reduced Lyα" sets in Larson et al. (2022), optimized to fit galaxies at 4 ≤ z ≤ 7. The set of main sequence stellar SEDs is taken directly from the PHOENIX stellar library, encompassing a wide range of spectral types, luminosities, and effective temperatures (Husser et al. 2013).In line with that, brown dwarf spectra are obtained from the Sonora models, covering diverse properties of self-luminous extrasolar planets along with type L, T, and Y brown dwarfs (Marley et al. 2021).After compiling the required SED models, we define a redshift grid of 0.05 ≤ z ≤ 20 with ∆z = 0.05 step and distribute the quasar and galaxy templates accordingly.Furthermore, we create one additional grid for the galaxy template, forcing the redshifts to be z ≤ 5.5 to ensure that our candidates are distinct from the low-z sources.After that, depending on the z values, intergalactic medium (IGM) attenuation is applied following the analytical equation proposed by Inoue et al. (2014).In contrast, we set the redshifts to be close to zero for the star and brown dwarf models.
Each quasar candidate will be modeled with four classes of templates: quasar, galaxy, star, and low-z source.The likelihood of the source being a quasar is subsequently determined by comparing its χ 2 divided by the number of bands employed in the SED fitting (hereafter χ 2 n ).To be exact, we define the goodnessof-fit for quasar, galaxy, star, and low-z source model as χ 2 n, q , χ 2 n, g , χ 2 n, s , and χ 2 n, lz , respectively, and compare their values.Examples of the resulting SED fit and the image cutouts are displayed in Figure 1.A preliminary quasar selection is then performed utilizing the criteria as follows: 1. Detections in four NIRCam bands (F115W, F150W, F277W, and F444W) with more than 5σ.These bands cover the region redward of the expected Lyα emission at z ≳ 6. 2. S/N < 3 in the optical bands blueward of the anticipated Lyα break.Precisely, we use both the HST/ACS F435W and F606W bands for the JADES, UNCOVER, CEERS, and PRIMER datasets, while Subaru/HSC g and r filters are utilized for the COSMOS-Web sources.3. The best-fit model for the observed SED is not a star but either a galaxy or a quasar with the inferred χ 2 n, g and χ 2 n, q values being < 10.Here, we do not require the candidates to be best fitted by a pure quasar model since their host galaxy emission could dominate the observed SEDs in the lower luminosity regimes, as found in most of JWST-confirmed, z ≳ 5 faint AGNs to date (see, for example, Harikane et al. 2023;Maiolino et al. 2023a;Greene et al. 2023).4. The source is located at high-z, indicated by the estimated photometric redshift being z phot > 6, both for the galaxy and quasar models.5.The integrated redshift probability at z > 5.5 should be more than 90%, that is, P(z > 5.5) > 0.9.
Of all sources identified in the combined catalogs, 1370 targets pass our initial criteria and are then visually inspected.More than half of these candidates are cosmic rays, hot pixels, stray light residues, contaminated by nearby bright sources, moving objects, or other detector artifacts.During the visual inspection stage, we also discard candidates with extended morphologies, as they could be low-z dusty sources not visible in the groundbased and HST imaging.Sources with compact shapes with circularized diameter less than 0 ′′ .5 are preferred because their light is likely dominated by a centralized emission component around the galactic nucleus.
To identify sources that have already been spectroscopically confirmed and published in the literature, we cross-match our candidates to the DJA's JWST sources repository 7 and the 7 https://dawn-cph.github.io/dja/general/jwst-sourcesSIMBAD Astronomical Database8 (Wenger et al. 2000).Correspondingly, the current datasets that we have contain at least 36 confirmed broad-line AGNs at 4 ≲ z ≲ 9 reported in the literature, for which 11 of them are located at z ≳ 6 (Harikane et al. 2023;Maiolino et al. 2023a;Larson et al. 2023;Kocevski et al. 2023;Übler et al. 2023;Kokorev et al. 2023;Stone et al. 2023).Our z phot estimates for these sources agree with the redshifts derived via spectroscopy considering the estimated uncertainties, indicating a good performance of our SED fitting with eazy-py.As mentioned before, most of these sources (≈80%) prefer bestfit SEDs based on the galaxy spectral templates, given the substantial brightness of their host galaxy emission, while the remaining objects opt for the pure quasar models.Also, these confirmed AGNs often show compact shapes consistent with our selection criteria.After discarding spurious sources and already published high-z galaxies in the literature, our final selection yields 350 remaining quasar/galaxy candidates (see Table 1).
It is noteworthy to mention that the redshift calculated via broadband photometry can exhibit a systematic deviation from the one based on spectroscopy, which we refer to as a systematic offset bias (e.g., Carrasco Kind & Brunner 2013; Nishizawa et al. 2020).This bias is quantified as ∆z = (z phot − z spec )/(1 + z spec ).For a subset of 27,856 confirmed AGN/galaxies with available spectroscopic data from the DJA's JWST sources repository, on which we applied our SED modeling, we find that the average bias is ⟨∆z⟩ = −0.05while its standard deviation is σ = 0.46.The outlier fraction, defined as the fraction of sources with |∆z| > 0.15, is 23%.When we focus on the subset that meets the high-z source criteria outlined in the preceding paragraphs, the corresponding statistics shift to ⟨∆z⟩ = 0.01, σ = 0.03, and an outlier fraction of 6%.While there is a noticeable scatter in the accuracy of z phot , these results are already sufficient to distinguish between low-and high-z sources, with contamination rates of roughly 5%-25% (see Figure 2).The region with darker colors corresponds to a higher number of sources within the 2D histogram bins.We also show the metrics for a subset that satisfies our high-z selection criteria (white circles with error bars).Samples of spectroscopically confirmed AGNs from the literature are depicted with red circles.The total model spectrum (black) in the observed-frame wavelengths, corrected for the IGM attenuation, is composed of stellar (yellow), dust, and AGN (orange) emissions.These decomposed components are shown without adding the IGM absorption model.We also report the reduced chi-square value (χ 2 red ), fraction of AGN flux to the total emission ( f AGN ) within rest-wavelengths of 0.1-0.7 µm, photometric redshift (z phot ), dust extinction coefficient (A V ), host galaxy stellar mass (M * ), and star formation rate (SFR).The lower part of each panel displays the relative residual between the data and the model.Sources that are better modeled with no AGN contribution have f AGN ≤ 0.05, while the ones selected as quasar candidates exhibit f AGN ≥ 0.2.

Measurements of the galaxy properties
After robustly identifying a sample of high-z galaxy and AGN candidates, which should have few interlopers or spurious members, we carry out complementary SED modeling to extract galaxy and AGN parameters from the broad-band SEDs and will also robustly estimate AGN contributions in this sample.We treat this high-confidence candidate sample as a sample of actual high-z galaxies with variations in AGN contribution between 0% and 100%.We discuss the validity of this approach in the following sections.
We model the SEDs using the Code Investigating GALaxy Emission (CIGALE; Boquien et al. 2019;Yang et al. 2020Yang et al. , 2022) ) package.Following the default configuration as a reference, we consider a delayed star formation history (SFH) with an efolding time range of 0.1 ≤ τ ≤ 5 Gyr and a recent burst, assuming Bruzual & Charlot (2003) stellar population models along with a Chabrier (2003) initial mass function (IMF) and stellar metallicity of Z = 0.02.Next, nebular emission is approximated using the Inoue (2011) model, while the dust extinction is added utilizing the combined Calzetti et al. (2000) and Leitherer et al. (2002) attenuation laws, dubbed as the modified starburst module in the CIGALE setup.We set the E(B − V) color excess for both nebular lines and stellar continuum to be between 0.05 and 2.65, which is equivalent to dust attenuation levels of A V ≈ 0.2-8.2,assuming a ratio of total-to-selective extinction of R V = A V /E(B − V) = 3.1.This wide range of attenuation levels is chosen since z ≲ 5 dust-enshrouded star-forming galaxies could appear as if they were sources at extremely high redshifts (Zavala et al. 2023;Meyer et al. 2023).
The ionization parameter, gas metallicity, and electron density of nebular lines are fixed to log U = −2, Z gas = 0.02, and n e = 100 cm −3 , respectively.We acknowledge that opting for this choice could introduce an additional uncertainty of up to 5% on the inferred AGN-to-host galaxy flux ratio, along with ∼0.1 dex in the measurements of stellar mass.However, its impact on the accuracy of photometric redshift estimations is observed to be minimal.We also note that U, Z gas , and n e display higher sensitivity to altering the emission line strengths and lower sensitivity to modifying the continuum shape, indicating broadband photometry data alone, as we used here, would not be enough to constrain them well (Kaasinen et al. 2017;Kewley et al. 2019).Given the considerations, the introduced tradeoff is acceptable for achieving a simpler model with significantly faster computation times.
Since we are also interested in assessing how much the AGN emission contributes to the observed total fluxes, we make use of the Skirtor2016 model provided by CIGALE on top of the previous SED sets (Stalevski et al. 2012(Stalevski et al. , 2016)).This 3D radiative transfer AGN model includes the accretion disk emission on the UV/optical side and the torus plus polar dust emission at infrared (IR) wavelengths.In addition, the adopted AGN inclination angle could affect the resulting AGN class, namely, obscured or unobscured.Accordingly, we set this as a free parameter to cover both types.Following that, the IGM attenuation effect is appended as a function of redshift following the formula from Meiksin (2006).Finally, the SED models consisting of the galaxy and AGN components are fitted within redshift bins of 0.05 ≤ z ≤ 16 using a step size of ∆z = 0.05.The CIGALE input file will be provided as supplementary data with this paper for reader reference and accessibility.We refer to the CIGALE docu- mentation9 for detailed information on all the spectral templates adopted here (Boquien et al. 2019;Yang et al. 2022).
Examples of the best-fit SED model made with CIGALE are portrayed in Figure 3. Correspondingly, the current SED modeling yields posterior distributions of some physical parameters (see Figure 4), such as the AGN fraction of the total emission ( f AGN ), host galaxy stellar mass (M * ), and SFR averaged over 100 Myr, along with z phot and A V .It should be emphasized that f AGN is calculated considering only the rest-frame wavelengths from 0.1 to 0.7 µm, which is the region constrained by our ground and space-based data while excluding ALMA submillimeter measurements.Furthermore, this wavelength range covers essential broad emission lines in the quasar SED, such as Lyα, Hβ, and Hα.More details on the generated CIGALE output parameters are discussed in (Boquien et al. 2019).Discussion on the inferred physical characteristics of our quasar candidates will be showcased in the next section.

List of quasar candidates and their number density
Up to this stage, we have selected 350 candidates of high-z compact sources via our initial photometric cut, visual inspection, and advanced SED modeling with two independent codes.There will be unresolvable mismatches between observed SEDs and template inputs used for both modeling methods.Hence, there is space for nominal AGN components formally compensating for such template mismatch, even for fully nuclear-passive galaxies.For the subsequent analysis, we will use a threshold in formal AGN fraction to mitigate this.
We will only consider candidates with f AGN ≥ 0.2 to ensure the presence of actual AGN contribution to the observed emission.This threshold level is motivated by a comparative analysis of properties between active and inactive galaxies compiled from the literature, as elaborated in Appendix A. Overall, we anticipate that this cutoff will yield a completeness of approximately 80% in AGN selection, accompanied by a contamination rate as high as 30% from normal high-z galaxies.We further impose a black hole mass (M BH ) limit criterion, where M BH > 10 5 M ⊙ since confirming the quasar nature below this limit is challenging for numerous reasons.For instance, the bright host galaxy emission might dilute the quasar light, making the quasar signature hidden from the observers in the optical to NIR regimes (e.g., Fitriana & Murayama 2022).Furthermore, given the limitation of current observing facilities and the fact that these less massive quasars might only be capable of exhibiting Hα with a line width of ≲ 100 km s −1 , they will be hard to differentiate from the low-velocity outflows or the narrow-line emissions of their host galaxies (Maiolino et al. 2023b).Details on M BH estimation will be discussed later in Section 4.2, but in the end, 64 sources passed these AGN fraction and mass limit criteria of the 350 parent candidates.
As a further note, out of the 11 previously confirmed AGNs at redshifts z ≳ 6 reported by other studies (i.e., Harikane et al. 2023;Maiolino et al. 2023a;Larson et al. 2023;Kocevski et al. 2023;Übler et al. 2023;Kokorev et al. 2023;Stone et al. 2023), 9 sources met our selection criteria (see TableA.1).These known AGNs have intentionally been excluded from the final sample of the 64 quasar candidates presented here.These selected sources -that is, our final quasar candidates -are then marked as grade A while the unselected ones are labeled with grade B. All of our candidates are listed in Table B.1 of Appendix B, which contains information on their coordinates, photometry, and derived properties.Due to the file size constraints, the full table and figures containing the SED fitting results of each source will be exclusively available for online access.
The sky coverages of each survey in the current datasets, for illustration, are approximately 0.28 deg 2 , 57 arcmin 2 , and 49 arcmin 2 for COSMOS-Web, JADES, and UNCOVER, respectively (see Table 1).Consequently, within the COSMOS-Web field and adopting the luminosity function of Harikane et al. (2023), we expect to find around 18 quasars at z = 6-8 having the UV absolute magnitudes of M UV ≲ −21.With their deeper imaging, JADES and UNCOVER might recover about 12 and 23 sources brighter than M UV ≈ −19, respectively.Thus, the number of quasar candidates we found seems reasonable since it is within the appropriate range of the empirical predictions.We note that the luminosity function of Harikane et al. (2023) is derived based on the recent census of z ≈ 4-7 low-luminosity AGNs (−18.5 ≳ M UV ≳ −21.5) detected with the JWST observations.In contrast, if we take and extrapolate the models from Matsuoka et al. (2018) or Schindler et al. (2023) into the fainter regimes, for which they were anchored initially to the bright (M UV ≲ −22), unobscured quasar population at z ∼ 6, we anticipate finding only one source in each field.
We present the number density of our quasar candidates in Table 2 and Figure 5. Here, we consider a redshift range of z = 6.0-8.4,and the total solid angle covered by our datasets is around 0.45 deg 2 , which corresponds to a survey volume of approximately 8.2 × 10 6 Mpc 3 .The M UV for our candidates are calculated from the flux observed at the rest-frame wavelength of 1500 Å, derived based on our best-fit total SED model.Hence, the reported M UV accounts for the total emission from the quasar plus its host galaxy component.Accordingly, to construct the UV luminosity function, we perform 10 4 Monte Carlo draws of our quasar candidates, incorporating their observed M UV and z phot along with their associated uncertainties.These random draws are necessary for instances where sources may fall outside the predefined redshift range or get counted in different magnitude bins across various iterations.Given that the quasar count depends on the chosen f AGN threshold, we also vary this crite-rion from f AGN ≥ 0.2 to 0.9 to take into account additional errors resulting from our selection method.Our error estimation also accounts for the possible presence of low-z interlopers and inactive galaxies with a contamination rate of up to 30% (see, for example, Figure 2 and Appendix A).We caution that the resulting number density estimation has not been adjusted for survey incompleteness.2023) is designated with orange circles, and their double power-law model, along with its uncertainty, is portrayed with an orange dashed line and shaded region.We also show the AGN luminosity function at z = 4-6 reported by Maiolino et al. (2023a) with purple circles for comparison.The number density of our quasar candidates is higher than the extrapolation of the bright QLF.Nevertheless, it is consistent with the X-ray selected AGN luminosity function (XLF) from Parsa et al. (2018) and Giallongo et al. (2019), which are denoted as dashed and dotted gray lines.
In general, the number density of our quasar candidates exceeds the extrapolated values of the brighter quasar population luminosity function by a factor of ≈10 (e.g., Matsuoka et al. 2018;Schindler et al. 2023), as shown by the blue line in Figure 5. On the other hand, our numbers align with those reported by Harikane et al. (2023) to some extent; yet, densities at M UV ≳ −19 are largely uncertain given the source faintness and potential incompleteness in our quasar search method.Interestingly, our samples are consistent with the faint, X-ray-selected AGN luminosity function presented by Parsa et al. (2018) and Giallongo et al. (2019).The different nature of the bright and faint quasar populations might cause a large discrepancy between the luminosity functions mentioned earlier.At the same time, many of these faint sources are just being detected with JWST, and it is likely that much remains to be revealed.Below, we will discuss the constraints on the black hole and host galaxy characteristics of our quasar candidates.

Black hole and host galaxy masses
The distribution of the central black hole mass to the host galaxy's stellar mass ratio -that is, M BH /M * -is a tracer of the supermassive black hole (SMBH) formation history (Volonteri 2012).We want to again treat our high-probability quasar candidates as actual quasars and, under that assumption, infer black hole and stellar masses for them.To estimate M BH , we first adopt the canonical normalized accretion rate parameterized by the Eddington ratio, f Edd ≡ L bol /L Edd , where L bol and L Edd are the bolometric and Eddington luminosities, respectively (e.g., Wu & Shen 2022).In this case, L bol is calculated by multiplying a bolometric correction factor of 5.15 (Richards et al. 2006) with the monochromatic luminosity at the rest-frame wavelength of 3000 Å-that is, L 3000 = λL λ (3000 Å) -derived based on our best-fit AGN SED model obtained in Section 3.2.Then, we derive the lower limit M BH of our quasar candidates, assuming f Edd = 1 and considering that Eddington luminosity can be approximated using: (1) As the values derived here represent lower limits, the true M BH could potentially be significantly higher.A comparison between our SED-based M BH and those determined through broad emission line spectroscopy reveals an actual M BH that is ≈1.6 dex higher, as demonstrated in Appendix A. The observed offset is anticipated, given the significant influence of f Edd on our M BH estimates.Adjusting the assumed f Edd to a much lower value, such as 0.1, results in a 1 dex increase in our data points, bringing them closer to spectroscopic M BH values.
The inferred M BH /M * distribution of our quasar candidates inferred from Equation 1 and Section 3.2 is displayed in Figure 6.This distribution assumes that our quasar candidates may exhibit f Edd values ranging from 0.1 to 1 and includes that uncertainty.While our quasar candidates display a M * -M BH distribution slightly higher than that of galaxies at z ∼ 0 (e.g., Kormendy & Ho 2013;Reines & Volonteri 2015), with properties consistent with observed samples of other high-z low-luminosity AGNs (e.g., Harikane et al. 2023;Kocevski et al. 2023), we emphasize that the derived M BH values represent lower limits.
Luminous quasars hosting massive black holes tend to reside within galaxies with larger stellar masses, the M BH to M * ratios show a large diversity (see, e.g., Inayoshi et al. 2020;Fan et al. 2023).For instance, z ≳ 5 bright quasars examined by Yue et al. (2023) display M BH /M * reaching as high as 10%, which is significantly more prominent compared to the sources in the nearby Universe (e.g., Kormendy & Ho 2013).On the other hand, less luminous objects, such as samples of 4 ≲ z ≲ 7 AGNs from Harikane et al. (2023) are characterized by relatively lower M BH /M * ∼ 1%.To add further support of this diversity, Larson et al. (2023) reported a presence of a broad-line AGN at z = 8.679 exhibiting an M BH /M * ≈ 0.3%, while, conversely, Furtak et al. (2023) presented an AGN at z = 7.045 having M BH /M * ≳ 3%.Here, we need to note that for all samples, there are strong selection effects at play (e.g., Li et al. 2022), biasing against the ability to see low-luminosity AGN in bright galaxies.The exact impact will depend on the selection method but might imply limits by SED preselection, color-color cuts, emission-line strengths, or -as for our approach -a minimal required AGN fraction of the total flux.What all methods have in common is that they will preferentially find massive SMBHs.With that in mind, the comparison mentioned above implies that the growth of SMBHs at the upper envelope of these actually selected bright quasars may have preceded the star formation in their host galaxies (Kokorev et al. 2023;Maiolino et al. 2023b;Pacucci et al. 2023).
Whether the M BH -M * relation evolves with redshift is still a subject of debate.For example, Caplar et al. (2018) proposed an increasing SMBH to host mass ratio at higher redshifts, that is, M BH /M * ∝ (1 + z) 1.5 , which was inferred using an analytical approach to obtain the M BH -M * relation that fits the observed quasar luminosity function and SFR density (see also Pacucci & Loeb 2024).On the other hand, considering various observable SMBH and host galaxy properties, including mass functions and quasar distributions, Zhang et al. (2023) demonstrated that there is no significant evolution of M BH -M * up to z ∼ 10 (see also, for example, Suh et al. 2020;Ding et al. 2020;Li et al. 2021).In addition, in flux-limited surveys, quasars harboring overmassive black holes -e.g., M BH /M * > 0.01 -could dominate the pickedup samples due to selection effects (Lauer et al. 2007).As seen in Figure 6, luminous quasars investigated by Yue et al. (2023), Übler et al. (2023), andStone et al. (2023) lie way above the local M BH -M * relation, indicating a potential bias mentioned earlier.This bias might occur because larger SMBH masses could produce higher quasar luminosities, which are more accessible to locate in flux-limited observations.

Possible pathways for SMBH growth
The significant diversity observed in the most distant SMBHs and their host galaxies might suggest a range of distinct growth histories and progenitors, which we will discuss further here.While the exact seeding mechanisms remain elusive, it is generally accepted that early SMBHs might originate from at least two types of progenitors: (i) light seeds arising from the remnants of Population III stars having masses of ≈10-100 M ⊙ and (ii) heavy seeds with a mass range of 10 4 -10 6 M ⊙ produced by the collapse of primordial gas or dense star clusters (Inayoshi et al. 2020).
Here, we want to trace back the growth of our quasar candidates following the method presented by Pacucci & Loeb (2022).As the first step, we describe the connection between the initial seed mass M seed and the accumulated black hole mass M BH at a specific cosmic time t using the relation: In this case, t Edd is a mixture of constants where its typical value is ≈450 Myr (Pacucci & Loeb 2022), f Edd is the average Eddington ratio across the accretion time interval of ∆t, and ϵ is the mean radiative efficiency over the ∆t.The time interval is expressed as ∆t = t − t seed , where t seed corresponds to the black hole Other JWST AGNs  2023) are indicated with cyan circles.The gray dots and crosses are nearby galaxies and AGNs from Kormendy & Ho (2013) and Reines & Volonteri (2015).The black dashed lines mark the limits where M BH /M * equals 0.1 and 0.01.Our candidates show a slightly higher M BH to M * ratios than other galaxies at z ∼ 0 with consistent properties compared to high-z low-luminosity quasars.
seeding epoch.Assuming that a black hole seed is assembled at z = 25, this would equal a cosmic time of t seed ≈ 130 Myr.
For an accretion mode following the thin disk model, ϵ ranges from 0.34 down to 0.057, depending on whether the central black hole is maximally rotating or nonrotating (Fabian & Lasenby 2019;Pacucci & Loeb 2020;Ananna et al. 2020).Then, the fraction of the quasar lifetime for which the accretion occurs is parametrized with the duty cycle D. Unfortunately, f Edd and D are degenerate, meaning that one can obtain an identical value of M BH by combining different values of both growth parameters.Due to this reason, we assume that the sources are actively accreting throughout their entire lifetime so that D can be set to unity for simplicity.We subsequently simulate the SMBH growth using a simple Monte Carlo strategy with 2000 realizations, exploiting Equation 2 as the target function.Starting from smaller seeds, we aim to match the masses of our quasar candidates that are, on average, within M BH = 10 5 -10 8 M ⊙ , at a median redshift of z = 6.7.Three essential parameters control our growth model, that is, M seed , f Edd , and ϵ.Correspondingly, we adopt a flat prior of log M seed ∈ [1, 6] M ⊙ , covering both the light and heavy seed mass regimes, and adjust the radiative efficiency in a physical range of ϵ ∈ [0.057, 0.34].Considering that (i) many high-z quasars detected so far are showing instantaneous accretion rates below or around the Eddington limit (e.g., Fan et al. 2023) and (ii) super-or hyper-Eddington accretion periods are typically short-lived (∆t ∼ 0.1 Myr), we adopt the Eddington ratio to be uniformly distributed within f Edd ∈ [0, 1] (e.g., Fragione & Pacucci 2023).While M seed is constant since the black hole is only seeded one time, ϵ and f Edd , on the other hand, may change over the period between the seed formation until it is detected at a later time.Thus, these two parameters should be viewed as averages over the quasar lifetime.
The combination of parameters that permits the formation of the central SMBHs we assume are residing in our quasar candidates is presented in Figure 7.At the same time, the as-sociated growth track is provided in Figure 8.The majority of bright quasars from Fan et al. (2023) occupy the region where M BH ≳ 10 8 M ⊙ while our candidates, as well as some JWSTconfirmed AGNs, reside in the lower mass side.(e.g., Goulding et al. 2023;Übler et al. 2023;Stone et al. 2023;Larson et al. 2023;Kocevski et al. 2023;Harikane et al. 2023;Maiolino et al. 2023a;Greene et al. 2023).Larger seeds with M seed > 10 4 M ⊙ seem to be the preferred progenitor to develop these SMBHs by z ≈ 7.In particular, most of our quasar candidates might have arisen from the black hole seeds as big as M seed ∼ 10 5 M ⊙ , assuming the values of f Edd = 0.6 ± 0.3 and ϵ = 0.2 ± 0.1.If short super-Eddington episodes occur during their evolution, the required progenitor mass could be lower, indicating that dense star cluster seeds could also be the ancestors of our quasar candidates.Distinguishing between the formation through direct collapse black hole or dense star cluster channels is complicated, given the necessity of more precise measurements of the SMBH and host galaxy masses along with the gas metallicity, denoting that extra spectroscopic data are needed (Volonteri et al. 2023).After that, we run similar modeling as a comparison, but now targeting the bright quasars with M BH ≈ 10 8 -10 9 M ⊙ compiled by Fan et al. (2023).As a result, this population also gives preference for heavy seeds with the Eddington ratio pushed higher to f Edd = 0.78 ± 0.17 and radiative efficiency going down to ϵ = 0.09 ± 0.03.We note that the parameter space occupied by this population is tighter than our less luminous quasar candidates, showing that detecting larger SMBHs at the farthest accessible distances could shrink the viable growth parameters and modes significantly.Furthermore, our simple calculation confirms that as long as the radiative efficiency is at the lower end of the range accommodated by the thin disk model and the ac-cretion is not dominated by super-Eddington episodes, it is less likely to yield SMBHs from the light seeds.
Maturing the light seeds in a short amount of time is complicated as this process would require the growth dominated with the Eddington-limited ( f Edd = 1) or even super-Eddington ( f Edd > 1) accretion to match the z ≈ 6-7 quasar mass distribution.However, assembling such enormous masses and sustaining high accretion rates will be challenging, given the intricacy created by the enhanced stellar feedback from the host galaxies.For example, a vast number of supernova explosions will happen during the rapid mass build-up, resulting in intense heating and mixing of the gas, making the accretion inefficient and more likely to be sub-Eddington (Larson et al. 2023).The only way to develop the light seeds into SMBHs is probably to adopt a hypothetical slim disk scenario, which lowers the radiative efficiency to ϵ = 0.04 (Abramowicz et al. 1988;Mineshige et al. 2000;Pacucci et al. 2015;Volonteri et al. 2015).With just a mild accretion of f Edd = 0.3, for example, this channel could already produce ≳ 10 6 M ⊙ black holes by z ≈ 7 as shown in Figure 8. Whether this mass accumulation channel dominates the high-z quasar population is still debatable.Therefore, further study to understand the typical accretion mode and the interplay between the growth parameters of early black holes is vital to constrain the evolution of these intriguing sources.Fig. 8. Black hole mass growth as a function of redshift.The red contour represents the expected lower limit masses of the quasar candidates identified in this study.Additionally, the typical photometric redshift uncertainty for these candidates is illustrated in the lower right corner of the panel.Green circles depict the bright quasar samples from Fan et al. (2023) while blue crosses display broad-line AGNs from the literature that have been observed with JWST spectroscopy (see main text).The cyan-shaded region shows the mass range of different progenitors.The solid black line and region show an evolutionary track along with its posterior distribution, assuming a thin disk accretion and heavy seed progenitors.On the other hand, cases where we use a thin disk accretion at the Eddington limit and a slim disk model to grow light seeds into SMBHs are shown with dash-dotted and dashed black lines.

Summary and conclusion
We have presented 350 candidates of compact galaxies, of which 64 show a high probability of being quasars at z ≳ 6, selected by exploiting the rich multiband dataset of COSMOS-Web, as well as the JADES, UNCOVER, CEERS, and PRIMER projects.These surveys consist primarily of JWST/NIRCam observations.The subsequent photometric catalog creation incorporated ancillary data from HST and other ground-based surveys.Accordingly, our search strategy consists of two primary steps: photometric cut on catalog-level information and SED fitting to separate the quasars from other contaminating sources.While the initial goals of the SED fitting are to classify and estimate the photometric redshift of each candidate, we also assess their associated physical properties under the assumption that they indeed are quasars, including the SMBH and host galaxy's stellar masses, as well as the fraction of AGN emission.
Our quasar candidates exhibit features consistent with the low-luminosity AGN population, where they potentially host less massive SMBHs with M BH ≈ 10 5 -10 8 M ⊙ residing in galaxies having M * ≈ 10 8 -10 10 M ⊙ .Furthermore, these sources display M BH -M * distribution that is slightly higher than those of galaxies at z ∼ 0 (e.g., Kormendy & Ho 2013;Reines & Volonteri 2015), or in other words, their SMBHs tend to be overmassive compared to their hosts.However, we stress that all quasars identified in these surveys are naturally biased to high M BH /M *ratios.This means they are not representative of the underlying population but preferentially form the upper envelope of the distribution.
With this in mind, we then run a simple Monte Carlo simulation to explain how these SMBHs accumulate their mass by the time they are detected.Larger seeds from the direct collapse scenario, with M seed > 10 4 M ⊙ , appear to be the favored origins to develop these SMBHs by z ∼ 7. Notably, most of our quasar candidates might have emerged from the black hole seeds as large as M seed ∼ 10 5 M ⊙ , considering the values of f Edd = 0.6 ± 0.3 and ϵ = 0.2 ± 0.1 -that is, the Eddington limited accretion in thin disk model.If brief super-Eddington events arise during their growth, the required progenitor mass could be smaller, implying that dense star cluster seeds could also be the ancestors of our quasar candidates.
As we have offered the most promising and robust highz quasar candidates in this paper, further confirmation is vital to uncover their true nature.For example, spectroscopy with JWST would be the best opportunity to acquire the rest-frame UV/optical spectra of these quasars, allowing the detection of broad emission lines to get more precise SMBH mass measurements and gas-phase metallicity.In addition to that, we can probe the cold molecular gas, tracing the galaxy dynamics and star formation activity, with ALMA.With all of that being said, the samples presented in this work are ideal laboratories for dissecting the nature of the first galaxies and SMBHs formed during the reionization era.
quasar candidates compared to confirmed AGNs and inactive galaxies in Figure A.2. Since a substantial overlap between the colors of unobscured AGNs and galaxies is observed, employing a more advanced technique, such as full SED fitting as done here, is more effective than using simple color cuts for accurately identifying these blue quasars.In the future, complementing our current datasets with more mid-infrared (MIR) measurements will be instrumental in differentiating AGNs from star-forming galaxies.This distinction arises from the fact that the presence of hot dust emission in MIR bands is a unique feature not easily attributable to stellar light or cold dust within the interstellar medium.Galaxy samples from the DJA's JWST sources repository at low, medium, and high redshifts are marked with gray crosses, orange empty circles, and orange-filled circles, respectively.Broad-line AGNs are indicated with blue colors, where filled symbols denote objects at 6 < z < 9, while empty symbols show those at 4 < z < 6 (see the figure legend).On the other hand, samples of our quasar candidates existing in the same extragalactic fields are portrayed with red circles.Substantial overlap between the colors of unobscured AGNs -that is, those with blue rest-frame UV continuum -and galaxies make it challenging to separate them using simple color cuts, indicating that full SED fitting is a better way to recover those blue quasars.Notes.Column (1): name or identifier of each source.Column (2)-(3): right ascension (RA) and declination (Dec) in decimal degrees.Column (4): spectroscopic redshift.Column (5): fraction of AGN component to the total emission within the rest-wavelengths of 0.1-0.7 µm.Column (6): lower limit black hole mass assuming an accretion at Eddington limit.Column (7): total stellar mass calculated following the method presented in this work.Column (8): original literature describing the object.Empty columns indicate that the data is unavailable in the public JWST datasets, and the corresponding source is not used to benchmark our quasar selection method.

Fig. 1 .
Fig. 1.Photometric SEDs of some quasar candidates.In the left part of each panel, we model the observed photometric data points with four types of spectral templates.Fluxes with S/N > 3 are marked with red circles, while those with lower values are shown with a bit transparent color.The best-fit quasar template and its associated synthetic photometry are presented with blue lines and circles.Models based on the galaxy and star/brown dwarf spectra are shown with magenta and yellow lines, with an additional fitted model of low-z sources displayed in gray color.The goodness-of-fit of each model and the estimated redshift are also reported (see main text for a detailed definition).We further indicate the calculated redshift probability distribution function, P(z) for quasar and galaxy models.Finally, the right part of each panel shows the multiband images of the z ≳ quasar candidate, each trimmed to 6 ′′ size on a side.

Fig. 2 .
Fig. 2. Comparison between z phot and z spec .The number count (N), average bias (|∆z|), scatter (σ), and outlier fraction (|∆z| > 0.15) of all sources (blue squares) with available spectroscopic data are reported.The region with darker colors corresponds to a higher number of sources within the 2D histogram bins.We also show the metrics for a subset that satisfies our high-z selection criteria (white circles with error bars).Samples of spectroscopically confirmed AGNs from the literature are depicted with red circles.

Fig. 3 .
Fig.3.Examples of SED fitting with CIGALE with AGN plus galaxy components.Observed and upper limit fluxes are shown in the upper part of each panel with red dots and triangles with error bars, respectively.The total model spectrum (black) in the observed-frame wavelengths, corrected for the IGM attenuation, is composed of stellar (yellow), dust, and AGN (orange) emissions.These decomposed components are shown without adding the IGM absorption model.We also report the reduced chi-square value (χ 2 red ), fraction of AGN flux to the total emission ( f AGN ) within rest-wavelengths of 0.1-0.7 µm, photometric redshift (z phot ), dust extinction coefficient (A V ), host galaxy stellar mass (M * ), and star formation rate (SFR).The lower part of each panel displays the relative residual between the data and the model.Sources that are better modeled with no AGN contribution have f AGN ≤ 0.05, while the ones selected as quasar candidates exhibit f AGN ≥ 0.2.

Fig. 5 .
Fig. 5. UV luminosity functions of sources at z ∼ 6.The number densities of our quasar candidates as a function of UV absolute magnitude are marked with red circles with error bars.Blue circles and squares represent the data from Matsuoka et al. (2018) and Schindler et al. (2023), where their associated best-fit quasar luminosity function (QLF) is shown with a blue line.The fitted model and observed galaxy luminosity function (GLF) from Bouwens et al. (2021) are displayed with a green line and circles, respectively.A sample of JWST-confirmed AGNs from Harikane et al. (2023) is designated with orange circles, and their double power-law model, along with its uncertainty, is portrayed with an orange dashed line and shaded region.We also show the AGN luminosity function at z = 4-6 reported byMaiolino et al. (2023a) with purple circles for comparison.The number density of our quasar candidates is higher than the extrapolation of the bright QLF.Nevertheless, it is consistent with the X-ray selected AGN luminosity function (XLF) fromParsa et al. (2018) andGiallongo et al. (2019), which are denoted as dashed and dotted gray lines.

Fig. 6 .
Fig. 6.Relation between the black hole mass (M BH ) and its host galaxy stellar mass (M * ).The red contour represents our quasar candidates at z ≳ 6, where our measurements can only provide lower limits for M BH , considering Eddington ratio values ranging from 0.1 to 1.The typical statistical errors for M * are indicated in the lower right corner of the panel.High-z quasar samples with available JWST spectroscopic data from Harikane et al. (2023), Yue et al. (2023), Ding et al. (2023), and Maiolino et al. (2023a, excluding dual AGNs) are presented with blue, green, orange, and purple circles with error bars.Additional AGN samples from Larson et al. (2023), Übler et al. (2023), Stone et al. (2023),Kocevski et al. (2023),Kokorev et al. (2023), andGoulding et al. (2023) are indicated with cyan circles.The gray dots and crosses are nearby galaxies and AGNs fromKormendy & Ho (2013) andReines & Volonteri (2015).The black dashed lines mark the limits where M BH /M * equals 0.1 and 0.01.Our candidates show a slightly higher M BH to M * ratios than other galaxies at z ∼ 0 with consistent properties compared to high-z low-luminosity quasars.

Fig. 7 .
Fig. 7. Combination of M seed , f Edd , ϵ values that can produce the observed M BH of high-z quasars.The parameter distributions of our quasar candidates and sources from Fan et al. (2023) are depicted in blue and orange colors, respectively.Assuming a thin disk model and Eddingtonlimited accretion, larger seed masses with M seed > 10 4 M ⊙ are the preferred channel for growing the SMBHs.
Fig. A.2. JWST/NIRCam color diagram of spectroscopically confirmed sources residing in the CEERS, UNCOVER, and GOODS-S datasets.Galaxy samples from the DJA's JWST sources repository at low, medium, and high redshifts are marked with gray crosses, orange empty circles, and orange-filled circles, respectively.Broad-line AGNs are indicated with blue colors, where filled symbols denote objects at 6 < z < 9, while empty symbols show those at 4 < z < 6 (see the figure legend).On the other hand, samples of our quasar candidates existing in the same extragalactic fields are portrayed with red circles.Substantial overlap between the colors of unobscured AGNs -that is, those with blue rest-frame UV continuum -and galaxies make it challenging to separate them using simple color cuts, indicating that full SED fitting is a better way to recover those blue quasars.

Table 1 .
Overview of the employed selection that we used to detect the high-z quasars.Notes.At the end of our search, we found 350 compact sources, including 64 showing attributes consistent with low-luminosity AGNs.We also report the sky area covered by each dataset and the faintest F444W magnitude of the candidates.

Table 2 .
Number density of our 6 ≲ z ≲ 8 quasar candidates.Notes.The columns from left to right are: (1) the UV absolute magnitude bins, (2) the average and standard deviation of the number densities, and (3) the average number of objects obtained from the Monte Carlo draws.The reported numbers are not corrected for possible survey incompleteness.

Table A .
1. Compilation of AGN samples that have been spectroscopically characterized with JWST from the literature.Article number, page 17 of 19 A&A proofs: manuscript no.main Table A.1.continued.