SCUBA-2 Ultra Deep Imaging EAO Survey (STUDIES). IV. Spatial Clustering and Halo Masses of Submillimeter Galaxies

, , , , , , , , , , , , , , , , , , , , , , , , and

Published 2020 June 2 © 2020. The American Astronomical Society. All rights reserved.
, , Citation Chen-Fatt Lim et al 2020 ApJ 895 104 DOI 10.3847/1538-4357/ab8eaf

Download Article PDF
DownloadArticle ePub

You need an eReader or compatible software to experience the benefits of the ePub3 file format.

0004-637X/895/2/104

Abstract

We analyze an extremely deep 450 μm image (1σ = 0.56 mJy beam−1) of a ≃300 arcmin2 area in the CANDELS/COSMOS field as part of the Sub-millimeter Common User Bolometric Array-2 Ultra Deep Imaging EAO Survey. We select a robust (signal-to-noise ratio ≥4) and flux-limited (≥4 mJy) sample of 164 submillimeter galaxies (SMGs) at 450 μm that have K-band counterparts in the COSMOS2015 catalog identified from radio or mid-infrared imaging. Utilizing this SMG sample and the 4705 K-band-selected non-SMGs that reside within the noise level ≤1 mJy beam−1 region of the 450 μm image as a training set, we develop a machine-learning classifier using K-band magnitude and color–color pairs based on the 13-band photometry available in this field. We apply the trained machine-learning classifier to the wider COSMOS field (1.6 deg2) using the same COSMOS2015 catalog and identify a sample of 6182 SMG candidates with similar colors. The number density, radio and/or mid-infrared detection rates, redshift and stellar-mass distributions, and the stacked 450 μm fluxes of these SMG candidates, from the S2COSMOS observations of the wide field, agree with the measurements made in the much smaller CANDELS field, supporting the effectiveness of the classifier. Using this SMG candidate sample, we measure the two-point autocorrelation functions from z = 3 down to z = 0.5. We find that the SMG candidates reside in halos with masses of ≃(2.0 ± 0.5) × 1013 h−1 M across this redshift range. We do not find evidence of downsizing that has been suggested by other recent observational studies.

Export citation and abstract BibTeX RIS

1. Introduction

Over the past two decades, a class of far-infrared luminous galaxies has been discovered at submillimeter wavelengths. The extreme infrared luminosities observed in these submillimeter galaxies (SMGs) suggest that they are dusty and considered to be among the most intensively star-forming sources in the universe (Smail et al. 1997; Barger et al. 1998, 1999; Hughes et al. 1998; Eales et al. 1999). SMGs appear to have a redshift distribution peaking at z ≃ 2.5 with the majority of them at z = 1.5–3.5 (Barger et al. 2000; Chapman et al. 2003, 2005; Pope et al. 2006; Aretxaga et al. 2007; Wardlow et al. 2011; Michałowski et al. 2012b, 2017; Yun et al. 2012; Simpson et al. 2014, 2017; Chen et al. 2016; Koprowski et al. 2016; Danielson et al. 2017; Dunlop et al. 2017; Stach et al. 2019; Dudzevičiūtė et al. 2020), occupying the same putative peak epoch of unobscured star formation (Madau & Dickinson 2014) and active galactic nucleus (AGN) activity (Schmidt et al. 1995; Hasinger et al. 2005; Wall et al. 2008; Assef et al. 2011). The total infrared luminosities (LIR; 8–1000 μm) of SMGs are similar to local ultra-luminous infrared galaxies (Sanders et al. 1988; Sanders & Mirabel 1996; Farrah et al. 2001; Armus et al. 2009), reaching values greater than a few times 1012 L or even higher than 1013 L for some of the brightest sources. This corresponds to star formation rates (SFRs) ranging from around 100 M yr−1 to more than 1000 M yr−1 (Hainline et al. 2011; Barger et al. 2012; Swinbank et al. 2014; da Cunha et al. 2015; Simpson et al. 2015; Dudzevičiūtė et al. 2020).

Violent gas accretion, potentially induced by mergers, is the most likely explanation to date for the intensive star formation of SMGs (Frayer et al. 1998; Conselice et al. 2003; Greve et al. 2005; Tacconi et al. 2008; Engel et al. 2010; Swinbank et al. 2010; Alaghband-Zadeh et al. 2012; Menéndez-Delmestre et al. 2013; Chen et al. 2015; Koprowski et al. 2016; Michałowski et al. 2017; Chang et al. 2018). Large amounts of gas accretion can result in a short-lived starburst possibly followed by a quasar phase. Feedback mechanisms from star formation or black hole accretion could have injected sufficient energy to heat the remaining gas or expel it from the galaxy to prevent further star formation (Silk & Rees 1998; Fabian 1999; Trayford et al. 2016). This scenario may be responsible for the formation of the most massive (M* > 1011 M) elliptical galaxies in the local universe (Lilly et al. 1999; Hopkins et al. 2005; Simpson et al. 2014; Toft et al. 2014; Dudzevičiūtė et al. 2020; Rennehan et al. 2020). Therefore, the cosmological evolution of SMGs is crucial for our understanding of the formation of massive galaxies in the universe.

Comparison of clustering measurements with dark matter simulations can provide constraints on the masses of dark matter halos that a given galaxy population resides in (Peebles 1980) and further trace the evolution of the given galaxy population. Previous clustering analyses of SMGs identified in shorter (250–500 μm; Cooray et al. 2010; Maddox et al. 2010; Mitchell-Wynne et al. 2012; van Kampen et al. 2012; Amvrosiadis et al. 2019) and longer (850–1100 μm; Scott et al. 2002; Webb et al. 2003; Weiß et al. 2009; Lindner et al. 2011; Williams et al. 2011; Hickox et al. 2012; Wilkinson et al. 2017; An et al. 2019) submillimeter wave bands have revealed that SMGs reside in high-mass (1012–1013 h−1 M) dark matter halos. These values are also consistent with previous estimates from a sample of obscured starburst galaxies reported by Béthermin et al. (2014), in which they used a combined BzK color criterion and Herschel/PACS data to study the clustering signal of obscured starburst galaxies at 1.5 < z < 2.5 as a function of their physical parameters. These results suggest that SMGs may be the progenitors of massive elliptical galaxies in the local universe (Hughes et al. 1998; Eales et al. 1999; Swinbank et al. 2006; Targett et al. 2011; Miller et al. 2018; Wang et al. 2019). However, many of these previous studies were limited by either the modest samples of SMGs (≃100 sources) or a lack of reliable identifications and redshift measurements, which makes their estimated clustering signals highly uncertain.

More precise determinations of the clustering properties with sizable SMG samples have been performed by Chen et al. (2016), Wilkinson et al. (2017), Amvrosiadis et al. (2019), and An et al. (2019). Chen et al. (2016) identified a sample of ≃3000 faint SMGs (S850 μm < 2 mJy) using a color selection technique and compared their clustering properties with other galaxy populations in the redshift range 1 < z < 5. Wilkinson et al. (2017) performed a clustering analysis using a sample of ≃600 850 μm–selected SMGs in the UKIDSS Ultra Deep Survey field in the redshift interval 1 < z < 3. Amvrosiadis et al. (2019) studied the clustering properties for a sample of ≃120,000 Herschel-selected SMGs with flux densities S250 μm > 30 mJy in low (z < 0.3) and high (1 < z < 5) redshift intervals. An et al. (2019) identified ≃7000 potential 850 μm–selected SMGs in the COSMOS field based on a radio+machine-learning method trained on the Atacama Large Millimeter/submillimeter Array (ALMA)–identified sample (An et al. 2018) and studied their clustering properties.

These aforementioned recent studies were able to measure the clustering signals from SMGs as a function of redshift. Wilkinson et al. (2017) found that the clustering appears to exhibit tentative evolution with redshift, such that the SMG activity seems to be shifting to less massive halos at lower redshifts z = 1–2 and consistent with the downsizing scenario (Cowie et al. 1996; Magliocchetti et al. 2014; Rennehan et al. 2020) that the contribution of luminous sources dominates in the early universe, whereas the growth of the less luminous ones continues at lower redshifts. In contrast, Chen et al. (2016), Amvrosiadis et al. (2019), and An et al. (2019) did not find such a trend, suggesting that SMGs reside in a typical halo mass of about 1013 h−1 M across the redshift range 1 < z < 5. The discrepancies in the lower-redshift bins could be simply caused by the measurement uncertainties (uncertain identifications and/or poor redshift measurements) or by the different methodologies that are adopted in the clustering analyses, where Wilkinson et al. (2017) relied on the cross-correlation technique with an abundant K-band selected sample, while Chen et al. (2016), Amvrosiadis et al. (2019), and An et al. (2019) adopted an autocorrelation technique.

However, these studies did not probe the clustering signals in a key redshift range (0.3 < z < 1.0; cosmic time ranges from 6.0 to 10.4 Gyr) in which the downsizing effect, if it exists, is expected to increase. This is likely caused by the longer-wavelength observations being more sensitive to high-redshift sources. The traditional 850 μm selection allows us to measure clustering at z > 1 (Chen et al. 2016; Wilkinson et al. 2017; An et al. 2019), whereas the studies based on Herschel at shorter wavelengths (e.g., 250 μm; van Kampen et al. 2012; Amvrosiadis et al. 2019) are mainly sensitive to the brightest low-redshift sources (S250 μm > 30 mJy) due to their positive K-correction. Using spectral energy distribution (SED) template fitting on the far-infrared photometry to estimate redshifts for sources without optical counterparts, Amvrosiadis et al. (2019) extended the Herschel-based clustering studies to z > 1, finding results consistent with those obtained from the 850 μm selection. However, the large redshift uncertainties (≃0.3 for z = 1; Amvrosiadis et al. 2019) have the consequence that they cannot meaningfully measure the clustering signals in the low-redshift regime.

Observations at mid-infrared (e.g., 24 μm) can be used to overcome the aforementioned selection bias and probe the clustering signals in the key redshift range of 0.3 < z < 1.0 (Gilli et al. 2007; Magliocchetti et al. 2008; Starikova et al. 2012; Dolley et al. 2014; Solarz et al. 2015). These studies found relatively lower clustering strengths at z < 1 with clustering length r0 = 3–6 h−1 Mpc compared to the high-redshift measurements. This finding is similar to the work of Magliocchetti et al. (2013) based on Herschel/PACS 100 μm–selected sources, in which they found clear evidence for the downsizing effect at redshifts limited to z ≲ 1. However, the clustering lengths also correlate with the infrared luminosities, where the galaxies with higher LIR (higher SFRs) tend to have stronger clustering signals (Dolley et al. 2014; Toba et al. 2017). The majority of the sources in the above studies at 24 and 100 μm are biased toward a fainter population with LIR ≃ 1011 L at z < 1, which prevents us from making a fair comparison with the SMGs at z > 1 that have LIR > 1012 L.

In this paper, we base our analysis on the 450 μm data obtained from the Sub-millimeter Common User Bolometric Array-2 (SCUBA-2; Holland et al. 2013) camera on the 15 m James Clerk Maxwell Telescope (JCMT). The 450 μm observations allow us to obtain photometric measurements closer to the redshifted SED peak of typical SMGs (λrest ≃ 100 μm) so they provide a closer match to far-infrared luminosity selection compared to longer-wavelength observations. The 450 μm observations also allow us to probe the SMGs at lower redshifts (z ≃ 1.5), with the majority of them at z = 0.5–3.0 (Casey et al. 2013; Simpson et al. 2014; Bourne et al. 2017; Zavala et al. 2018; Lim et al. 2020). We have pushed the frontier of the 450 μm imaging by initiating a new SCUBA-2 imaging survey in the CANDELS/Cosmic Evolution Survey (COSMOS; Scoville et al. 2007) field, called the SCUBA-2 Ultra Deep Imaging EAO Survey (STUDIES; Wang et al. 2017). By including all of the archival data, we have constructed an extremely deep single-dish image at 450 μm (σ450 μm = 0.56 mJy beam−1), which is the deepest image yet obtained at 450 μm. A series of papers has been published based on STUDIES, including number counts (Wang et al. 2017), stellar morphology (Chang et al. 2018), and multiwavelength properties of the sample (Lim et al. 2020). In this work, we develop a machine-learning classifier based on 164 sources that have 450 μm flux density ≥ 4 mJy, signal-to-noise ratio (S/N) ≥ 4, and K-band counterparts from their radio or mid-infrared identifications. Our machine-learning classifier labels a sample of 6182 SMG candidates in the wider COSMOS field. We employ an autocorrelation technique on the SMG candidates to statistically estimate their clustering signal, which allows us to infer the dark matter halo mass and to constrain the clustering evolution of SMGs from z = 3 down to z = 0.5.

This paper is structured as follows. In Section 2, we introduce the ancillary data in the COSMOS field, as well as the observations, data reduction techniques, source extraction procedure, and training data set. In Section 3, we describe the machine-learning methodology we use for SMG candidate identification. In Section 4, we verify our machine-learning technique in selecting the SMG candidates. We present the comparison samples in Section 5 and the clustering properties of SMG candidates in Section 6. We summarize our findings in Section 7. Throughout this work, the standard errors of our sample distribution medians are estimated from bootstrap analysis. The term "SMG candidates" in this work represents the machine-learned candidates of 450 μm-selected SMGs, unless otherwise stated. We adopt cosmological parameters H0 = 70 km s−1 Mpc−1, ΩΛ = 0.70, Ωm = 0.30, and σ8 = 0.83 (Planck Collaboration et al. 2014).

2. Data

2.1. Main Data

The SCUBA-2 camera contains 5000 pixels (field of view ≃45 arcmin2) in each of the 450 and 850 μm detector arrays. The SCUBA-2 camera operates at 450 and 850 μm simultaneously and provides an unprecedented mapping speed, meaning that it can efficiently survey large areas of sky at 450 and 850 μm. The beam size of SCUBA-2 is 7farcs9 at 450 μm, which is an order of magnitude smaller in area compared to far-infrared observations from Herschel at similar wavelengths (24''–35''). The 450 μm data presented in this paper come from three sources: STUDIES (Wang et al. 2017), the data taken by Casey et al. (2013, hereafter C13), and the SCUBA-2 Cosmology Legacy Survey 450 μm campaign in the COSMOS field (S2CLS-COSMOS; Geach et al. 2013, 2017). We briefly describe these programs in turn.

STUDIES is a JCMT Large Program that aims to reach the confusion limit at 450 μm within the CANDELS (Grogin et al. 2011; Koekemoer et al. 2011) footprint in the COSMOS field. The mapping center of STUDIES is R.A. = 10h00m30fs7 and decl. = +02°26'40''. The CV DAISY mapping pattern was used, resulting in an 3' diameter area of approximately uniform coverage of high sensitivity, with noise level increasing outside of this area as a function of radius. In this work, we only adopt the STUDIES data that were collected until 2019 April, amounting to 252 hr of exposures, about 84% of the total allocated integration (330 hr) for the whole project. We note that data collection for STUDIES is now 99% complete (by 2020 April), and the final version of the deeper image and source catalog will be published in the future. In addition, as part of the S2CLS project, the 450 μm S2CLS-COSMOS was observed with two CV DAISY maps offset by 2' in decl. from the central pointing of R.A. = 10h00m30fs7 and decl. = +02°22'40'', with some overlap resulting in an area of ≃13 arcmin2 with noise level <5 mJy beam−1. The total on-sky integration of S2CLS-COSMOS is 150 hr. The survey of C13 used the PONG-900 scan pattern, mapping with a center of R.A. = 10h00m28fs0 and decl. = +02°24'00'', resulting in a wider circular map that reaches a uniform depth over an area of approximately 700 arcmin2. The total on-target time of C13 is 38 hr, and these data are much shallower compared to the STUDIES and S2CLS data sets. The majority of the aforementioned observations were conducted under the best submillimeter weather on Maunakea ("Band 1", τ225 GHz < 0.05, where τ225 GHz is the sky opacity at 225 GHz).

The procedure used for data reduction are similar to those in Wang et al. (2017) and Lim et al. (2020). In brief, we apply the following steps.

  • 1.  
    The raw time-stream data from SCUBA-2 were flat-fielded using the flat scans that bracket each science observation, and the data were scaled to units of pW.
  • 2.  
    The time streams were then assumed to be a linear combination of noise and signal from the background (atmospheric water and ambient thermal emission), as well as astronomical objects. The procedure then entered an iterative stage that attempts to fit these components with a model by using the Dynamic Iterative Map-Maker routine of Sub-Millimeter Common User Reduction Facility (SMURF; Chapin et al. 2013).
  • 3.  
    We then calibrated the individual reduced scans into units of flux density by using the weighted mean flux conversion factor (FCF) of 476 ± 95 Jy beam−1 pW−1. The adopted FCF was estimated from a subset of submillimeter calibrators observed under Band-1 weather during the survey campaigns and was consistent with the canonical value estimated from a wider base of calibrators (Dempsey et al. 2013).
  • 4.  
    Individual scans were then co-added in an optimal, noise-weighted manner, by using the MOSAIC_JCMT_IMAGES recipe from the PIpeline for Combining and Analyzing Reduced Data (PICARD; Jenness et al. 2008).
  • 5.  
    To improve the detectability of faint point sources, we convolved the map with a broad Gaussian kernel of FWHM = 20'' and subtracted the convolved map from the original maps to remove any large-scale structure in the sky background. We then convolved the subtracted map with a Gaussian kernel that is matched to the instrumental point-spread function (FWHM = 7farcs9, Dempsey et al. 2013). We used the SCUBA2_MATCHED_FILTER recipe in the PICARD environment for this procedure.

Finally, we constructed an extremely deep 450 μm image, with the STUDIES, C13, and S2CLS-COSMOS data combined (Figure 1). The final image covers a sensitive region of ≃300 arcmin2. The instrumental noise level in the deepest regions is ≃0.56 mJy beam−1, which is roughly 16% deeper than those in previous works (Lim et al. 2020).

Figure 1.

Figure 1. JCMT SCUBA-2 450 μm S/N image, showing the positions of the 164 S/N ≥ 4 and S450 μm ≥ 4 mJy sources with K-band counterparts based on the VLA and MIPS identifications (red circles). The cyan contours show the instrumental noise levels with contours at 1, 5, 9, 13, and 17 mJy.

Standard image High-resolution image

2.2. Ancillary Data

We employed radio and near-/mid-infrared identifications to construct a sample of SMG counterparts used for this work (Section 2.3). In the radio, we used the catalog from the Jansky Very Large Array (VLA) Large Project survey at 3 GHz (VLA-COSMOS; Smolčić et al. 2017). In the near-infrared, we employed the archival IRAC catalog (Sanders et al. 2007) obtained from the Spitzer Space Telescope. In the mid-infrared, we generated our own 24 μm catalog using SExtractor (Bertin & Arnouts 1996), since the archival MIPS 24 μm catalog (Sanders et al. 2007) only contains sources with ${S}_{24\mu {\rm{m}}}\gt 150$ μJy. We run SExtractor in the S-COSMOS 24 μm image (Sanders et al. 2007) and recalibrated our extracted 24 μm fluxes to their Spitzer General Observer Cycle 3 total flux. Our generated catalog has a 3.5σ detection limit of 57 μJy without using positional priors from other wavelengths. We verify that our generated catalog is in good agreement with the deep Spitzer catalog (${S}_{24\mu {\rm{m}}}\gt 80$ μJy) provided by Le Floc'h et al. (2009) with a median value of $| {\rm{\Delta }}{S}_{24\mu {\rm{m}}}| /{S}_{24\mu {\rm{m}}}={6}_{-4}^{+10} \% $ and with median flux peak offsets of Δα = 0farcs0 ± 0farcs3 and Δδ = 0farcs3 ± 0farcs3.

We utilized the multiwavelength band-merged COSMOS2015 catalog compiled by Laigle et al. (2016), which includes 30+ bands of photometric data, spanning from X-ray through the near-ultraviolet and optical to the far-infrared. We used the combined "FLAG_HJMCC = 0 (or 2)" and "FLAG_COSMOS = 1", which is the region covered by UltraVISTA-DR2 occupying an area of 1.6 deg2 in the COSMOS field. The near-infrared data (e.g., K-band) is essential for accurate photometric redshift and stellar-mass estimates, and the observed K-band magnitude correlates well with stellar mass up to z ∼ 4 (Laigle et al. 2016). To ensure a uniform selection, we further limited ourselves to galaxies with K-band magnitude of mK < 24.5 magAB (limiting magnitude at 3σ in a 2'' diameter aperture from Laigle et al. 2016). The COSMOS2015 catalog also provides stellar-mass and redshift estimations, which are used in our sample selection (Section 5) and clustering analysis (Section 6). The stellar-mass and redshift measurements in the COSMOS2015 catalog were fitted using the LE PHARE code (Arnouts et al. 1999; Ilbert et al. 2006). In this work, we do not exclude AGNs from our sample, and we refer readers to Laigle et al. (2016) for further details.

We employed a stacking technique to assess the averaged 450 and 850 μm flux density of our SMG candidates (Section 4) using the wide-field SCUBA-2 image from S2COSMOS (Simpson et al. 2019). S2COSMOS is an EAO Large Program that has mapped the entire COSMOS field to a uniform depth at 850 μm of σ = 1.2 mJy beam−1. Thanks to the dual-band observing capability of SCUBA-2, 450 μm data were simultaneously obtained. While S2COSMOS is designed for the 850 μm imaging (it has been carried out under weather conditions less suitable for 450 μm observations), the large area and the depth reached 1σ ≃ 12 mJy beam−1 at 450 μm allow us to obtain strong constraints on the stacked flux of the SMG candidates.

2.3. Training Sample

The methodologies employed here for source extraction and counterpart identification are similar to those in Lim et al. (2020). We briefly summarize the method here, referring readers to Lim et al. (2020) for further details.

We employed a source extraction method similar to the "CLEAN" deconvolution in radio interferometric imaging. We identified the most significant peak in the S/N map and subtracted 5% of a peak-scaled model point-spread function from the image at its position. The subtracted flux and coordinates were then cataloged, and the subtraction was iterated until a significance threshold floor (=3.5σ) was reached. We summed up the subtracted fluxes and the remaining threshold flux density and took this to be the final flux density for each source. In this work, we limited our 450 μm–selected sample to the sources with total integrated S/N over the "CLEAN"ed area ≥4 due to the relatively high fraction of spurious sources (≥20%) at S/N < 4. We further limited our sources to those with S450 μm ≥ 4 mJy beam−1 to achieve a more uniform selection and to address the nonuniform sensitivity coverage of our map. In total, we obtained 221 such sources from a region of ≃300 arcmin2.

To construct the K-band source catalog for our 450 μm–selected sample, we first cross-matched the positions of 450 μm sources with positions from the VLA-COSMOS 3 GHz radio catalog (Smolčić et al. 2017) using a 4'' search radius. In total, we found 131 VLA-identified sources, and this procedure is expected to produce ≃3 false matches out of the 131. The expected false matches are derived from the number density of the matched catalog at a certain search radius and then multiplied by the total number of matched sources. We then cross-matched the radio positions with coordinates from the Spitzer IRAC near-infrared catalog (Sanders et al. 2007) using a 1'' search radius (≃4 expected false matches out of 123). For the remaining 90 450 μm sources without radio counterparts, we cross-matched them to the Spitzer MIPS 24 μm catalog using a search radius of 4'' (≃4 expected false matches out of 65). Based on the 24 μm positions, we then made a positional matching in the IRAC near-infrared catalog by using a 2'' search radius (≃2 expected false matches out of 52). Once we obtained the IRAC positions from the mid-infrared or radio counterparts, we associated these sources with the band-merged COSMOS2015 photometric catalog (Laigle et al. 2016) using a search radius of 1''.

Among the 221 sources, 164 (74% ± 6%) of the SMGs have K-band counterparts, of which, 117 sources are VLA-identified and 47 are MIPS-identified. All of them are detected significantly according to a corrected-Poissonian probability identification technique (p-values <0.05; see Downes et al. 1986). We employed these 164 sources as our SMG data set for the machine-learning algorithm (Section 3). It is worth noting that 57 sources (26% ± 3% of total 221 sources) do not have MIPS/VLA counterparts and so they are not in the training set. This missing population does not exhibit significant variation in 450 μm flux density compared to the sources having MIPS/VLA identifications with p-value of 0.16 in a Kolmogorov–Smirnov test (KS test). These 57 sources most likely lie at higher redshifts (z ≳ 3; see also Figure 4 in Lim et al. 2020) due to the fact that the mid-infrared and radio wave bands do not benefit from a strong negative K-correction, so they are biased against identifying the high-redshift SMGs. Therefore, we do not expect a significant impact from this missing population on our final results (Section 6), which mainly focus on z < 3.

To construct a non-SMG sample for the training that is undetected by SCUBA-2 but within the SCUBA-2 footprint, we select 4705 K-band-selected sources that reside within the noise level ≤1 mJy beam−1 region of the 450 μm image. We adopt their K-band magnitudes and color–color pairs (i.e., flux ratios) to be the feature vectors in the machine-learning algorithm. Given the faint optical magnitudes of most of our SMG sample, we only adopt the broadband photometries from COSMOS2015 catalog in this work. In total, we have 79 features, of which, 78 features are derived by the interlacing color quantities from 13-band photometry (${{uBVri}}^{+}{z}^{++}{JHK}$[3.6][4.5][5.8][8.0]).

3. Machine-learning Methodology

3.1. Performance Measures of Classification

Before introducing our machine-learning methodology, we first describe how we verify the performance of the classification. In the field of machine learning, a confusion matrix is often used to describe the performance of a classification model (Table 1). The confusion matrix has four terms: true positive (TP) refers to an actual positive sample correctly labeled as positive; false positive (FP) is a sample incorrectly flagged as positive while in reality it is negative; false negative (FN) corresponds to real positive cases incorrectly flagged as negative; and true negative (TN) represents an actual negative sample correctly labeled as negative. From these categories, one can compute the precision, recall (in other words, the sensitivity or true positive rate, TPR), and false positive rate (FPR) as

The precision represents the proportion of all correct positive predictions, while recall is the recovery of all real positive cases that are predicted to be positive. Conversely, the FPR is the ratio between the number of actual negative cases wrongly categorized as positive and the total negative cases.

Table 1.  Confusion Matrix for Binary Classification

  Positive prediction Negative prediction
Positive class True Positive (TP) False Negative (FN)
Negative class False Positive (FP) True Negative (TN)

Download table as:  ASCIITypeset image

Several meaningful indicators are often used to verify the performance of a classifier. The f1-measure (Rijsbergen 1979) considers both precision and recall to compute a score. This f1-score can be interpreted as a harmonic average of precision and recall, which can be measured as

Equation (1)

The f1-score reaches its best value at 1 and worst at 0.

Another standard test for evaluating a binary decision problem is using the Receiver Operating Characteristic (ROC) Curve that is plotting TPR against FPR (Provost et al. 1997). A perfect classifier, which has no FN and FP, will have a high value of TPR and low value of FPR. Therefore, a higher value of the area under the ROC curve (AUROCC) corresponds to a better classifier.

3.2. Methodology

In this work, we adopt the extreme gradient boosting (XGBoost; Chen & Guestrin 2016) method that is designed based on a scalable gradient tree boosting learning, since the XGBoost performs the best for identifying the SMG candidates in our sample (Section 3.5; see also other similar works done with XGBoost, An et al. 2018, 2019; Liu et al. 2019). The idea of gradient tree boosting is to build an ensemble of simpler estimators (usually they are decision trees) and convert them sequentially into a complex predictor. During the iterations of building the trees, each tree will correct the error between the predictions and the actual output from the existing trees and minimize the training error of the ensemble. The contribution of each tree can be scaled to reduce the influence of each tree that will leave more space for future trees to improve the ensemble. This process will lead to a better model and prevent the behavior of over-fitting (Friedman 2000, 2002). Typically, smaller values of this weighting (i.e., learning rate or shrinkage) seem to produce a better performance (Friedman 2000).

XGBoost is designed to push the limit of computational resources and improve the model performance for the gradient tree boosting algorithm. XGBoost performs a split finding algorithm that enumerated over all possible splits on all features and finds the best split in tree learning. The advantage of XGBoost over other techniques is that it can handle missing features. When the algorithm encounters a missing value, it is labeled as going into the default direction, which was already learned from the data.

In this work, we define 70% of our data to be the training set and the rest as the test set, where the training set builds the model, and the testing set evaluates the model's performance. In principle, the predictor performs better with a larger fraction of training data. On the other hand, the performance statistic will have greater variance if there is less testing data (Kohavi 1995). To strike a balance, we have tried changing the ratios of our training and testing sets from 50:50 to 90:10, and we verify that the XGBoost performs the best in both AUROCC and f1-score with a 70:30 split.

To avoid over-fitting, we adopt an early stopping in XGBoost after five training iterations that do not yield any improvements. To control the balance of positive and negative weights, we set the scale_pos_weight = 28.7, which is given by the ratio between the number of negative and positive instances (4705/164; see Section 2.3).

3.3. Feature Selection

Feature selection is an important process in machine learning that strongly influences the performance of the model. Feature selection is a procedure of selecting a subset of relevant features for model construction. In general, proper feature selection can increase the efficiency by reducing the training duration, enhancing generalization by reducing over-fitting, and improving the prediction performance (see Chen & Guestrin 2016; An et al. 2019).

A trained XGBoost model will automatically calculate feature importance and provide the feature importance scores. In this work, we first trained the XGBoost model based on the training data set and selected features by sorting feature importance scores calculated from the trained model. We then iteratively trained the model based on the selected subset of features until the point of best performance. We verified that both f1-score and AUROCC increase with the number of selected features until we use up all 79 features. Therefore, we did not reduce the number of feature vectors in this work. Considering the limitations of our computational resources, we repeated this procedure 10 times by using a different combination of training and test data sets in each iteration and estimated the average feature importance score from these 10 realizations. The top five important features in our sample are: K, ([3.6]−[4.5]), (K − [4.5]), ([3.6]–[5.8]), and (HK), which are similar to those photometric wave bands used by Chen et al. (2016; Optical-Infrared Triple Color: (zK), (K − [3.6]), and ([3.6]–[4.5])) and An et al. (2019; (zK), (JK), (K − [3.6]), and ([3.6]–[4.5])). The top five important features in our sample can be associated to fundamental physical properties. The K-band flux roughly maps to the stellar mass, while the SMGs appear to be red and occupy a relatively well-defined region in near-infrared color–color space (see also Chen et al. 2016; An et al. 2019).

3.4. Tuning the Hyper-parameters

The hyper-parameters are a set of parameters that define the machine-learning algorithm as a mathematical formula. The hyper-parameters act as tuning functions that are set during the training of the model. We optimized the hyper-parameters in XGBoost using k-fold cross-validation (An et al. 2019). This is a resampling procedure used to iteratively evaluate the performance of machine-learning models on a limited data sample. In each iteration, this procedure randomly divides the training data set into k groups/folds (approximately equal size). Each unique fold is treated as a validation set, and the model is fitted on the remaining $k-1$ folds. The validation set is replaced k times, and the average performance measure of the k sets is then reported. In this work, we used k = 5 and adopted the AUROCC as the scoring function to optimize the hyper-parameters of the XGBoost classifier. We used the RandomizedSearchCV recipe from the python-based scikit-learn package (Pedregosa et al. 2011). RandomizedSearchCV randomly searches over a combination of hyper-parameter space and finds the best solution for the constructed model. In the same computational budget, RandomizedSearchCV can find models that are as good or better compared to a pure grid search (Bergstra & Bengio 2012). In this work, we set the number of iterations to 200. We verify that both f1-score and AUROCC converge after 200 iterations (confirming up to 300 iterations).

3.5. Algorithms Comparison

We repeated the procedure in Section 3.4 10 times by using exactly the same combinations of training and test data sets used in Section 3.3 for each iteration. Each optimizer gives different results in labeling the SMG candidates in accordance with the expectation. We verify that the labeled SMG candidates change within 10% from each optimizer. The mean performance from these 10 optimizers is summarized in Table 2.

Table 2.  Performance of Test Samples Using Several Machine-learning Methods

Methodology AUROCC f1-score Precision Recall
Adaptive boosting 0.97 ± 0.01 0.55 ± 0.07 0.80 ± 0.07 0.43 ± 0.09
Decision tree 0.92 ± 0.02 0.36 ± 0.03 0.23 ± 0.03 0.87 ± 0.07
Logistic regression 0.97 ± 0.01 0.56 ± 0.07 0.77 ± 0.08 0.46 ± 0.10
Random forest 0.97 ± 0.01 0.54 ± 0.07 0.81 ± 0.08 0.41 ± 0.07
Stochastic gradient boosting 0.96 ± 0.01 0.61 ± 0.03 0.74 ± 0.06 0.53 ± 0.05
Support vector machines 0.91 ± 0.04 0.51 ± 0.04 0.78 ± 0.06 0.38 ± 0.05
XGBoost 0.97 ± 0.01 0.57 ± 0.07 0.51 ± 0.14 0.73 ± 0.12
XGBoost (with missing values) 0.97 ± 0.01 0.63 ± 0.02 0.59 ± 0.04 0.68 ± 0.07

Note. We replace the missing data with the detection limits of each feature vector, since most of the other algorithms do not handle missing values. The XGBoost method with missing values (marked in bold) performs the best for both AUROCC and f1-score.

Download table as:  ASCIITypeset image

We also test the performance of XGBoost against other machine-learning algorithms. To make a fair comparison, we replaced the missing data with the detection limits of each feature vector since most of the other algorithms do not handle missing values. We optimized each of the algorithms (similar to what was done in Section 3.4) and repeated the procedure 10 times by using the exactly same combinations of training and test data sets used in Section 3.3 in each iteration. As shown in Table 2, the XGBoost method with missing values performs the best in terms of both AUROCC and f1-score. Furthermore, we verify that the values of AUROCC, f1-score, precision, and recall do not change significantly with redshift when we review our test data set in specific redshift bins.

3.6. Result from the XGBoost Algorithm

To accommodate the different results from each optimizer due to different combinations of training and test data sets, we adopt the following approach so that the training results are close to the mean performance from the 10 optimizers mentioned in Section 3.4. We combined the predicted class probabilities (output from predict_proba algorithm in scikit-learn package) from each optimizer by using a combined probability formula ${\left({P}_{1}{P}_{2}...{P}_{n}\right)}^{\tfrac{1}{n}}$, where n is 10 in this work. We labeled the sources that have a final class probability ≥0.5 (the default threshold for two-classes classification in XGBoost), as SMG candidates. This procedure is similar to the bootstrap aggregating, so called bagging (Breiman 1996). Bagging is a two-step process: bootstrapping and aggregating. Bootstrapping is a sampling method that randomly selects several subsets of samples from the entire data set. The individual subset of samples is then taken as the training data set for the machine-learning models. The aggregation then combines the model predictions from those subsets into a final prediction considering all possible outcomes. In short, the bagging procedure can generate an aggregated predictor from multiple predictors, which can improve the stability and accuracy of machine-learning algorithms and reduce variance to avoid over-fitting (Breiman 1996).

To validate the performance of the aggregated predictor, we isolated 30% of our sample as the independent test sample and split the remaining 70% into 50%:20% for training. We trained the 50%+20% data set 10 times in XGBoost, in which the 50% and 20% subsamples were randomly drawn in each iteration. We used the trained classifiers to estimate the predicted class probabilities of the isolated 30% test set in each iteration and performed the bagging procedure in their probabilities. We confirmed that the performance of the aggregated predictor (precision and recall) is almost the same as the mean performance (precision and recall) from those 10 iterations, indicating that the aggregated predictor is a representative classifier for a number of classifiers. Therefore, we conclude that the performance of the aggregated classifier (after the bagging procedure) should be similar to the mean performance shown in Table 2.

Based on the adopted training results, there is a non-negligible degree of misidentification (precision = 0.59 ± 0.04) and incompleteness (recall = 0.68 ± 0.07) in our classifications. It is therefore worth investigating which population is labeled incorrectly as SMGs while, in reality, it is not (i.e., the FP) and what fraction of SMGs are labeled incorrectly as field galaxies (i.e., the FN) by our trained XGBoost algorithm. To do this, we investigate the properties of FP in each of the 10 iterations. The median stellar mass and median redshift of our training SMGs are log(M*/M) =${10.76}_{-0.02}^{+0.04}$ and $z={1.64}_{-0.07}^{+0.16}$, respectively, and their median stacked flux is (7.1 ± 0.5) mJy at 450 μm based on our STUDIES image. We find that the overall properties of FP have a median redshift of z = 2.0 ± 0.3, median stellar mass of log(M*/M) = 10.7 ± 0.1, and median stacked flux of (2.5 ± 0.5) mJy at 450 μm, based on our STUDIES image. These findings show that the FP has a similar stellar mass to the training SMG sample but appears to have slightly higher redshifts compared to the training SMG sample and is less active in obscured star formation. Meanwhile, we carry out the same investigation for the FN. The overall properties of FN have median redshift z = 1.4 ± 0.2, median stellar mass log(M*/M) = 10.5 ± 0.1, and median stacked flux (6.0 ± 0.5) mJy at 450 μm. These findings show that our classifier may miss the population that is slightly less massive and lower in redshift compared to our training SMG sample. We conclude that the effectiveness of our classifier could be hindered by the fuzzy boundary between the SMGs and the less-active star-forming galaxies having similar stellar masses. We do not expect that this finding will impact our final clustering measurements significantly in Section 6, where we show that the clustering signals of star-forming galaxies are very similar to the SMG candidates once they are matched in redshift and stellar mass.

As an additional test, shown in the Appendix, we have applied two distinct machine-learning algorithms with different levels of recall and precision and show that the clustering signals recovered for the SMGs are robust. The SMG candidates that are identified by decision tree (better recall) and random forest (better precision) yield clustering signals that are not significantly different from the results of XGBoost-identified SMG candidates, indicating that our final results are insensitive to the chosen classifiers (see the Appendix).

The spurious sources in the SMG training sample, which are wrongly labeled as SMGs, may impact the training sample and bias the performance estimators. We check that the cumulative spurious fraction of our SMG training sample is ≃10%. To quantify how sensitive our training is to this contamination, we randomly choose 10% of the SMG training sample and swap them with the MIPS- or VLA-detected non-SMGs in the STUDIES field. We then train the XGBoost by using this training sample and repeat the procedure 10 times. The mean AUROCC and f1-scores from these procedures are reduced by 3% and 10%, respectively, compared to the measurements shown in Table 2. These offsets are expected, since the contamination in the training sample will reduce the precision of the predictor. Considering the uncertainties in the measured AUROCC and f1-score, we conclude that our final clustering measurements (Section 6) will not be impacted significantly by the small fraction of spurious sources.

In total, our trained XGBoost algorithm labels 6182 SMG candidates from the COSMOS2015 catalog containing 307,374 sources that have an mK < 24.5 magAB across an effective area of 1.6 deg2.

4. Verifications of the SMG Candidates

While machine learning is a powerful and promising technique for data analysis, it inevitably appears to be a black box to typical users. Checks need to be performed to ensure that the selected SMG candidates have properties (especially those not used for the training) similar to those of the parent SMG sample.

The total number of SMG candidates (6182 sources) is consistent with the predicted number counts of ${6100}_{-1400}^{+1800}$ of 450 μm sources in the field. The predicted number count is derived by integrating the best-fitted function of Wang et al. (2017) over a flux range of 3.4–36 mJy. The predicted median flux in this flux density range is (6.8 ± 0.1) mJy. The lower end of the range is determined from the boosting corrected 450 μm flux density range of our training SMG data set. We further apply the fraction of 0.74 ± 0.06 to take into account the selection bias in our training data set (see Section 2.3), since our training SMG data set only includes SMGs having K-band counterparts.

To probe the submillimeter emissions of our 6182 SMG candidates, we directly measure their 450 μm flux density from the S2COSMOS image. The 450 μm flux distribution of SMG candidates is consistent with the measurement from the SMG training sample, based on the same image (p-value = 0.36 in KS test). The median stacked 450 μm flux density from the S2COSMOS image of the SMG candidates is (5.3 ± 0.2) mJy (Figure 3). This value is tentatively offset (≃2.3σ) from the median stacked flux of the SMG training sample (7.8 ± 1.1) mJy derived from the same S2COSMOS image but significantly offset from the aforementioned predicted median flux of (6.8 ± 0.1) mJy and the median stacked flux of the SMG training sample, based on our deeper STUDIES image (7.1 ± 0.5) mJy. The stacked value of the SMG training sample could be biased high, since sources close to the detection threshold will be included if they are on a peak of the noise. Indeed, the deboosted 450 μm stacked flux of the SMG training sample is 6.4 ± 0.5 mJy, only marginally higher than the stacked flux of the SMG candidates. To have an independent test, we apply the stacking analyses in the 850 μm images from the S2COSMOS and STUDIES surveys by assuming that the 450 and 850 μm noise maps are reasonably independent. The median stacked 850 μm flux density from the S2COSMOS image of our SMG candidates is (1.23 ± 0.03) mJy. This value is significantly offset from the median stacked 850 μm flux of the SMG training sample (1.9 ± 0.2) mJy based on the same image, but tentatively offset (≃1.9σ) from similar measurements based on our deeper STUDIES image (1.7 ± 0.2) mJy. The fainter stacked flux of the SMG candidates compared to the stacked flux of SMG training sample is expected, since our machine-learning algorithm misidentifies a fraction of SMGs (≃30%; see the recall value in Table 2). However, the stacked flux only provides the overall emission properties of our sample. We do not expect that our results in the following analyses will be affected significantly due to contamination, since the contamination arises from less dusty star-forming galaxies with similar stellar masses to the training SMG sample (see Section 3.6), and the clustering signals of star-forming galaxies are very similar to the SMG candidates once they are matched in redshift and stellar mass (Section 6).

All of the SMG training sample in the machine-learning algorithm has matched detections from the MIPS at 24 μm or VLA at 3 GHz, or both (see Section 2.3). The fraction of our SMG candidates that have MIPS and/or VLA detections is ≃88% ± 1% (=5442/6182), indicating that our SMG candidates behave similarly at mid-infrared and/or radio wavelengths compared with the SMG training sample. In addition, the MIPS- and VLA-flux distributions of our SMG candidates are consistent with the SMG training data set, having p-values of 0.77 and 0.31 in the KS test, respectively. Based on the S2COSMOS image, the stacked flux of our SMG candidates with MIPS or VLA detections is (5.6 ± 0.2) mJy, while the stacked flux of the remaining SMG candidates is (3.0 ± 0.5) mJy. This indicates that ≃12% (=740/6182) of our SMG candidates are misidentified and/or are biased toward fainter SMG population. Again, we do not expect that our results in the following analyses will be affected significantly due to this finding. It is worth noting that the MIPS 24 μm and VLA 3 GHz photometry are not part of the training features (Section 3.3). The high MIPS- or VLA-detection rates strongly support the reliability of our machine-learning algorithm.

By comparing other populations extracted from the COSMOS2015 catalog, we can test whether the distribution of our SMG candidates is consistent with the SMG training sample. Figure 2 shows the normalized histograms of photometric redshift and stellar mass for the 450 μm-selected SMG training sample, 450 μm-selected SMG candidates, and field galaxies. The median redshift of our SMG candidates is z = 1.69 ± 0.02 (Figure 2(a)), which is in excellent agreement with that of the SMG training sample ($z={1.64}_{-0.07}^{+0.16}$). The redshift distribution of our SMG candidates is consistent with the SMG training data set, having a p-value = 0.63 in the KS test, indicating that we cannot reject the null hypothesis of no difference between our SMG candidates and parent SMG training sample. On the other hand, the p-value is essentially zero in the KS test between the redshift distribution of our SMG candidates and that of the field galaxies in the COSMOS2015 catalog.

Figure 2.

Figure 2. Panel (a): normalized histograms of photometric redshift; and panel (b): normalized histograms of stellar mass for the SMG training sample, SMG candidates, and field galaxies. The median values are marked as downward arrows for the corresponding sample. The median redshifts and stellar masses of SMG candidates are consistent with the estimations from the parent SMG training data set.

Standard image High-resolution image
Figure 3.

Figure 3. Left panel: stacked SCUBA-2 450 μm image from the S2COSMOS survey (Simpson et al. 2019) at the positions of 6182 SMG candidates. The negative flux trough surrounding the source is caused by the matched-filter procedure from the SCUBA2_MATCHED_FILTER recipe (see Section 2). The stacked 450 μm flux density is (5.3 ± 0.2) mJy. Right panel: stacked 450 μm image at 6182 random positions.

Standard image High-resolution image

Figure 2(b) shows the stellar-mass distributions of the SMG candidates, SMG training sample, and the field galaxies over a redshift range of z = 0–6. The median stellar mass of our SMG candidates (log(M*/M) = 10.83 ± 0.01) is consistent with the median stellar mass of the SMG training sample (log(M*/M) =${10.76}_{-0.02}^{+0.04}$). A KS test performed on the stellar-mass distributions of SMG candidates and parent SMG training data set shows that the test cannot distinguish between these two populations at the 95% significance level (p-value = 0.15). The KS test rejects the null hypothesis that the stellar-mass distributions of our SMG candidates and field galaxies are drawn from the same distribution (p-value ≃0). The similarities of physical properties (e.g., redshift and stellar mass) between our SMG candidates and the parent SMG training data set are expected. This is because the features for estimating the physical properties in the COSMOS2015 catalog, which are based on the photometric data, are similar to the adopted features in our machine-learning algorithm.

We also test our SMG-identification technique on ALMA observations. ALMA follow-up observations resolved 260 850 μm–selected SMGs in the S2CLS-COSMOS program (AS2COSMOS; J.M. Simpson et al. 2020, in preparation) from 183 850 μm submillimeter sources with S/N > 4.3σ (rms ≃0.2 mJy). There are 165 ALMA-detected sources with K-band counterparts in the COSMOS2015 that further satisfy our selection criterion of mK < 24.5 magAB (see Section 2.2). Among these, our machine-learning algorithm successfully identifies 126 ALMA-detected sources as SMG candidates, indicating that the completeness of our identification is 76% ± 7%. On the other hand, 148 SMG candidates are located within the 183 ALMA primary-beam areas (FWHM = 17farcs3). Among them, 121 sources are detected by ALMA, suggesting that the precision of our machine-learning classifier is 82% ± 7%. However, the ALMA observations should not necessarily have high identification rates in our sample, since ALMA and JCMT were observing in different wavelengths, and therefore, the sensitivity limits are different. To coarsely estimate the 450 μm sensitivity in the AS2COSMOS survey, we employ the typical S450 μm/S850 μm ratio of 2.5–4.5 at the faint end (Hsu et al. 2016). The sensitivity of AS2COSMOS (≃0.86 mJy) is equivalent to a 450 μm sensitivity of ≃2–4 mJy, which is roughly our selection limit. Therefore, we conclude that our SMG candidates should be detected with AS2COSMOS, and the high identification rate of AS2COSMOS in our sample is expected, which again supports the reliability of our machine-learning algorithm.

5. Comparison Samples

To put the SMG candidates into the context of general galaxy populations at the same redshifts, we construct comparison samples of star-forming galaxies and passive galaxies.

5.1. Rest-frame NUV–r–J Color

Our comparison star-forming and passive galaxies are selected based on the rest-frame ${M}_{\mathrm{NUV}}-{M}_{r}$ and ${M}_{r}-{M}_{J}$ color cuts (Ilbert et al. 2013) for galaxies not flagged as SMG candidates by our machine-learning algorithm. To avoid incompleteness, in the following analyses, we only consider sources that have stellar-mass estimates above the 95% mass completeness. To empirically estimate the stellar-mass completeness as a function of redshift, we follow a procedure similar to that in Pozzetti et al. (2010). We take the faintest 20% of galaxies in K-band magnitude in several redshift bins to be a representative observational limit for our whole sample. We then find the 95th upper percentile from the stellar-mass distribution of this subsample in each of the redshift bins and take the values to be the stellar-mass limit for the corresponding redshift bins. The 95% mass completeness as a function of redshift can be described with a polynomial function log(Mlim/M) = 8.14 + 0.95z − 0.09z2.

Since it is known that galaxy clustering evolves with redshift and is a strong function of stellar mas (McCracken et al. 2015), to make a fair comparison, we need to construct a sample of comparison galaxies that are matched as closely as possible to our SMG candidates in redshift, stellar mass, and sample size. We first adopt the binned_statistic_2d algorithm from scipy package (Jones et al. 2001) to generate the two-dimensional histograms of our SMG candidates and comparison galaxies by using specific redshift and stellar-mass bins. We then randomly select a number of comparison galaxies in each bin, which is matched with the SMG candidates. The normalized redshift distributions for our SMG candidates and comparison samples are shown in Figure 4(a). As we can see, the number of passive galaxies drops at z > 2. Similarly, it is also hard to find sufficient numbers of massive star-forming galaxies that are not SMG candidates at z > 2. Therefore, in this work, we restrict the comparison samples to the galaxies at 0.5 < z < 2. In total, we randomly select 3021 and 3083 passive galaxies and star-forming galaxies, respectively, at 0.5 < z < 2 (summarized in Table 3). The redshift estimations for SMG candidates may be less reliable compared to the comparison passive and star-forming galaxies, since SMG candidates are expected to be dusty and consequently fainter in the optical. Interestingly, at 0.5 < z < 2, the median redshift errors for SMG candidates (σz = 0.02 ± 0.06) is statistically consistent with the comparison passive (σz = 0.02 ± 0.06) and star-forming galaxies (σz = 0.03 ± 0.06). Therefore, we conclude that the redshift estimations for all of our samples are equally reliable. The stellar-mass distribution for sources at 0.5 < z < 2 is plotted in Figure 4(b). The samples all show similar stellar-mass distributions according to the KS test (p-value >0.05).

Figure 4.

Figure 4. Panel (a): normalized redshift distributions for SMG candidates, and comparison samples of star-forming galaxies and passive galaxies. All of these populations are matched as closely as possible in redshift, stellar mass, and sample size. In this work, we restrict the comparison samples to the galaxies at 0.5 < z < 2 (gray shaded region), since it is difficult to find sufficient numbers of passive and massive star-forming galaxies at z > 2. Panel (b): normalized stellar-mass distributions for SMG candidates and comparison samples of star-forming galaxies and passive galaxies at 0.5 < z < 2.

Standard image High-resolution image

Table 3.  Results of Clustering Analyses

Sample Redshift Nsa b r0 log(Mhalo)
        (h−1 Mpc) (h−1 M)
SMG candidates 0.5 < z < 1.0 1112 2.4 + 0.3−0.3 ${8.1}_{-1.1}^{+1.0}$ ${13.4}_{-0.2}^{+0.2}$
  1.0 < z < 2.0 2205 ${3.9}_{-0.4}^{+0.4}$ ${9.8}_{-1.1}^{+1.1}$ ${13.4}_{-0.2}^{+0.1}$
  2.0 < z < 3.0 1518 ${5.9}_{-0.9}^{+0.7}$ ${10.9}_{-1.7}^{+1.5}$ ${13.2}_{-0.2}^{+0.2}$
Passive galaxies 0.5 < z < 1.0 1112 ${3.3}_{-0.2}^{+0.2}$ ${11.5}_{-0.8}^{+0.8}$ ${13.9}_{-0.1}^{+0.1}$
  1.0 < z < 2.0 1909 ${4.3}_{-0.5}^{+0.4}$ ${10.8}_{-1.3}^{+1.2}$ ${13.5}_{-0.2}^{+0.1}$
Star-forming galaxies 0.5 < z < 1.0 1090 ${2.5}_{-0.3}^{+0.3}$ ${8.4}_{-1.0}^{+1.0}$ ${13.5}_{-0.2}^{+0.1}$
  1.0 < z < 2.0 1993 ${4.1}_{-0.5}^{+0.4}$ ${10.4}_{-1.3}^{+1.2}$ ${13.5}_{-0.2}^{+0.1}$

Note.

aSample sizes of our samples in the corresponding redshift bins.

Download table as:  ASCIITypeset image

5.2. Physical Properties of SMG Candidates and Comparison Samples

In this section, we compare some physical properties between our SMG candidates and comparison samples. The stellar mass, dust extinction, and age of the stellar population assembled in the model can be recovered from SED template fitting. In this work, we adopted the COSMOS2015 catalog values, estimated with the LE PHARE code. In short, they used a library of synthetic spectra generated using stellar population synthesis from Bruzual & Charlot (2003) and assuming a Chabrier (2003) initial mass function. Both declining and delayed star formation histories were used. The SEDs were generated for a grid of 42 ages in the range 0.05–13 Gyr, and the attenuation curve of Calzetti et al. (2000) was applied to the templates with color excess in range of E(B − V) = 0–0.7.

Our results show that the SMG candidates, which are expected to be dusty, tend to have higher values of E(B − V), compared to star-forming and passive comparison samples (Figure 5(a)). A significant fraction of our SMG candidates have E(B − V) = 0.7, which is a cap on the maximum E(B − V) value introduced in the LE PHARE SED fitting procedure. This implies that the extinction corrections, which are applied to the sources in the COSMOS2015 catalog, may be insufficient, and consequently, their dust-corrected SFR estimations may be underestimated, particularly for the dusty population (see also Casey et al. 2013; Simpson et al. 2014; Elbaz et al. 2018; Dudzevičiūtė et al. 2020).

Figure 5.

Figure 5. Physical properties of SMG candidates and comparison galaxy samples. Panel (a): normalized histograms of extinction; panel (b): normalized histograms of stellar population age; panel (c): effective radius and stellar-mass relations; and panel (d): histograms of Sérsic index. In panels (a), (b), and (d), the median values are marked as downward arrows for the corresponding sample. In panel (c), the larger points show the running median of our sample and the ±1σ scatter, while the smaller circles show the effective radius and stellar-mass relation of individual sources for the corresponding sample. In panel (c), we also show the best-fit size–mass relations of star-forming and passive galaxies at z = 1.25 from van der Wel et al. (2014), which correspond to the median redshifts of our sample. In summary, we find that the SMG candidates are dustier, younger, larger, and more disk-like than the comparison samples that are matched in redshift and stellar mass.

Standard image High-resolution image

Figure 5(b) shows histograms of model stellar population age for our SMG candidates and comparison samples. It is worth noting that the age measurements derived from SED fitting alone should perhaps be treated with skepticism (see Dudzevičiūtė et al. 2020), since the galaxy colors become redder with increasing extinction or age (the age-dust degeneracy; e.g., Calzetti 2001; Pforr et al. 2012). This degeneracy might be more severe in the starburst systems such as SMGs (Hainline et al. 2011; Michałowski et al. 2012a; Simpson et al. 2014). On the other hand, the age measurements are highly dependent on the assumed star formation history, in which the bursty star formation models tend to produce the youngest ages, while the continuous star formation models tend to produce the oldest ages (Maraston et al. 2010; Hainline et al. 2011). Nevertheless, the distribution of stellar population age in SMG candidates peaks at younger ages (Figure 5(b)), even though the overall distribution of stellar masses is similar to that in comparing star-forming and passive galaxies. This suggests that the SMG candidates may be galaxies with more recent star formation (i.e., a higher proportion of young stars), since the amount of attenuation in the rest-frame ultraviolet or optical is so large that the SED fitting code will have to de-redden the optical SED to the youngest available stellar populations.

To investigate the spatial structure of our SMG candidates and comparison samples, we adopt the morphological properties of Hubble Space Telescope (HST) HF160W-selected galaxies in the CANDELS/COSMOS field from van der Wel et al. (2012). We only use the sources with "flag = 0" in this catalog, which consists of the objects with a good Sérsic model fits (Sérsic 1963, 1968) measured by the GALFIT code (Peng et al. 2010). According to van der Wel et al. (2012), the structural parameters can be measured with a precision and accuracy better than 10% down to HF160W ≃24.5 magAB. It is worth noting that our work here is similar to that in Chang et al. (2018). Chang et al. (2018) used a sample of SCUBA-2 SMGs with S450 μm > 2 mJy, while we adopt machine-learned SMG candidates with S450 μm > 4 mJy (see Section 4). Our sample size is about 50% larger than that in Chang et al. (2018), since we can use the entire catalog from van der Wel et al. (2012) for our SMG candidates. In total, we have 91 SMG candidates, 102 star-forming galaxies, and 99 passive galaxies with a robust Sérsic model fit.

Figure 5(c) shows the effective radius (Re) at a rest-frame wavelength of ≃5000 Å and stellar-mass relations for our SMG candidates and the comparison samples. To determine the Re value at a rest-frame wavelength of ≃5000 Å, we follow Equations (1) and (2) in van der Wel et al. (2014), which consider the wavelength dependence of Re as a function of redshift and stellar mass. In Figure 5(c), we also show the best-fit size–mass relations of star-forming and passive galaxies at z = 1.25 from van der Wel et al. (2014), which correspond to the median redshifts of these subsamples. The comparison passive galaxies are, on average, smaller than star-forming galaxies, which is consistent with earlier studies on both local and high-redshift samples (Shen et al. 2003; Ichikawa et al. 2012; Newman et al. 2012; Fernández Lorenzo et al. 2013; van der Wel et al. 2014). The median Re of our SMG candidates (${5.5}_{-0.4}^{+0.3}$ kpc) is consistent with previous studies of ALMA follow-up at LABOCA 870 μm–selected SMGs at z ≃ 2 (${R}_{{\rm{e}}}={4.4}_{-0.5}^{+1.1}$ kpc; Chen et al. 2015), SCUBA-2 SMGs at z = 0.5–1.5 (Re = 4.9 ± 0.3 kpc; Chang et al. 2018), and ALMA follow-up at SCUBA-2 SMGs (Re = 4.8 ± 0.3 kpc; Lang et al. 2019), within the errors, but somewhat higher than the measurement of radio-identified SMGs at z ≃ 2 (Re = 2.8 ± 0.4 kpc; Swinbank et al. 2010). Chen et al. (2015) attributed this to the fact that Swinbank et al. (2010) used shallower HST-NICMOS images, which tend to show smaller sizes. Our result shows that the SMG candidates are significantly (≃3σ) more extended than the comparison star-forming galaxies (${R}_{{\rm{e}}}={4.0}_{-0.2}^{+0.3}$ kpc). Even though we split our samples by stellar mass, we still find the trend that the SMG candidates are slightly larger than comparison star-forming galaxies in all mass bins (Figure 5(c)). The two-dimensional KS test (python-package ks2d2s; original estimation references from Peacock 1983) in the size–mass plane shows that the SMG candidates are significantly different from comparison star-forming galaxies (p-value = 0.006). We therefore confirm and strengthen the tentative findings in Chang et al. (2018), in which they showed that the 450 and 850 μm–selected SMGs are marginally different from a stellar-mass- and SFR-matched star-forming sample.

The Sérsic indices (ns) for our SMG candidates and comparison samples, which are from van der Wel et al. (2012), are shown as histograms in Figure 5(d). The median ns of our SMG candidates (1.1 ± 0.1) is consistent with previous studies, including ns = 1.4 ± 0.8 (Swinbank et al. 2010), ns = 1.2 ± 0.3 (Chen et al. 2015), ns = 1.1 ± 0.1 (Chang et al. 2018), and ns = 1.0 ± 0.2 (Lang et al. 2019). Our SMG candidates are statistically distinguishable, with a lower median of ns, compared to comparison star-forming (ns = 1.9 ± 0.1) and passive galaxies $({n}_{{\rm{s}}}={3.2}_{-0.1}^{+0.2})$. However, our result does not necessarily show that the SMG candidates are dominated by disk-like structures (ns ≃ 1), since most of the SMGs with low ns (≃1) are in fact visually classified as either irregulars or interacting systems (Chen et al. 2015).

In summary, we find that the SMG candidates are dustier, younger, larger, and more disk-like than the comparison samples that are matched in redshift and stellar mass. With these differences in mind, in the next section, we discuss their clustering properties.

6. Clustering and Halo Mass

We now investigate the standard two-point clustering statistics that can quantitatively measure the large-scale structure of the universe and trace the amplitude of galaxy clustering as a function of scale. The clustering measurements can further allow us to infer the dark matter halo mass and to constrain the evolution of clustering.

6.1. Two-point Autocorrelation Function

We measure the two-point galaxy autocorrelation function by using the Landy & Szalay (1993) estimator,

Equation (2)

where DD(θ), DR(θ), and RR(θ) are the normalized number of galaxy–galaxy, galaxy–random, and random–random pairs, respectively, counted within a given angular separation bin of θ ± δθ/2. The DR(θ) and RR(θ) are normalized to the same total number of pairs as DD(θ), with $\mathrm{DR}(\theta )=\tfrac{({N}_{{\rm{D}}}-1)}{2{N}_{{\rm{R}}}}{N}_{\mathrm{DR}}(\theta )$, and $\mathrm{RR}(\theta )=\tfrac{{N}_{{\rm{D}}}({N}_{{\rm{D}}}-1)}{{N}_{{\rm{R}}}({N}_{{\rm{R}}}-1)}{N}_{\mathrm{RR}}(\theta )$, where ND and NR are the number of sources in the galaxy and random samples, respectively, while NDR(θ) and NRR(θ) are the original counts of galaxy–random and random–random pairs, respectively. We adopted 10 times as many random points as the number of our galaxies.

The projected angular two-point autocorrelation function, ω(θ), can generally be described as a power law,

Equation (3)

where A is the clustering amplitude, and δ is the slope of the correlation function. The value δ = 0.8 has been found to be appropriate for both observations and theories at the physical separation of ≃0.1–10 h−1 Mpc (e.g., Peebles 1980; Davis & Peebles 1983; Zehavi et al. 2005; Coil et al. 2007, 2008). This statement still holds for the galaxy samples at redshift up to ≃5 (δ = 0.8–1.1; Ouchi et al. 2005; Coil et al. 2006; Kashikawa et al. 2006; Lee et al. 2006; Hildebrandt et al. 2009; Durkalec et al. 2015; Harikane et al. 2016; Jose et al. 2017).

Measurement of ω(θ) could be biased, since our sample is located in a single, area-limited region that may not be representative of the true underlying mean density over the whole sky. The observed ω(θ) is usually biased lower than the true value ω(θ)mod,

Equation (4)

by an additive factor of IC known as the integral constraint. In practice, IC can be numerically estimated (e.g., Infante 1994; Roche & Eales 1999; Adelberger et al. 2005) over the survey geometry using the random–random pairs of the form

Equation (5)

We repeated the estimate of Equation (2) 25 times. We calculated the variance, mean ω(θ), and the mean Nrr using these 25 estimates in Equation (5). We then employed a χ2 minimization in Equation (4) to derive the best-fit ω(θ)true over scales of 0farcm1–6' (≃0.1–6 h−1 Mpc at z = 2 for co-moving distance). The IC-corrected ω(θ) is shown as black symbols in Figure 6.

Figure 6.

Figure 6. Two-point autocorrelation function of our SMG candidates and the comparison samples at 0.5 < z < 3.0. The data points are offset horizontally, to avoid confusion. The dotted curves show the autocorrelation functions of the dark matter in the corresponding redshift bins.

Standard image High-resolution image

The errors in the clustering amplitude of ω(θ)true are expected to be underestimated, since the variance only accounts for the shot noise from the sample of the random points and the Poisson uncertainties of the DD counts, and it does not include the field-to-field variance (often called cosmic variance) of a field-limited survey. To quantify the uncertainty more realistically, we employed the "delete one jackknife" resampling method (see also Scranton et al. 2002; Zehavi et al. 2002, 2005; Norberg et al. 2009) to estimate the covariance matrix for the autocorrelation function measurements. The principle of the jackknife method is to first divide the data set into N independent subsamples, and then, one copy of the subsamples is omitted systematically at a time, similar to the k-fold cross-validation used in machine learning. Therefore, the resampling data set consists of Nsub − 1 remaining subarea, with area (Nsub − 1)/Nsub times the full area of the original data set. The covariance matrix of N independent realizations can be obtained as

where $\omega {({\theta }_{i,j})}^{k}$ is the autocorrelation function measured with the kth area removed, and $\overline{\omega ({\theta }_{i})}$ is the average autocorrelation function of the jackknife realizations. In practice, we divide our sample into Nsub = 13 nearly equally sized stripe-shaped subareas. We verify that the shapes of the subarea do not affect the jackknife results significantly. To determine the best-fit power-law model (Equation (3)) for each correlation function, we perform a χ2 minimization, where

The 1σ error is estimated based on finding where Δχ2 = 1 (Avni 1976).

6.2. Dark Matter Halo Mass

To quantify the underlying dark matter halo mass (Mhalo) of our sample, we first need to compute the galaxy bias (b), which can be defined as the square root of the ratio of the two-point correlation function of the galaxies relative to the dark matter,

where the ω(θ)DM is the projected angular two-point correlation function of the dark matter. To reproduce the clustering model in dark matter, we use the HALOFIT code from Smith et al. (2003) with improved fitting formulae provided by Takahashi et al. (2012), which can predict the nonlinear and dimensionless power spectrum of dark matter ${{\rm{\Delta }}}^{2}{(k)={k}^{3}P(k)/(2\pi )}^{2}$ for a wide range of cold dark matter cosmologies. The Fourier transform of the two-point correlation function is the power spectrum. We then project the power spectrum into the angular correlation function by using Limber's equation (Limber 1953; Peebles 1980; Peacock 1991; Baugh & Efstathiou 1993), specifically via the Equation (A6) in Myers et al. (2007). The ω(θ)DM profiles at z = 0.5–1, z = 1–2, and z = 2–3 are shown as dotted curves in Figure 6. We fit a single b parameter from the observed galaxy correlation function and the dark matter correlation function by minimizing χ2 on the scales of 0farcm1–6'. Finally, we convert the b to Mhalo using the ellipsoidal collapse model of Sheth et al. (2001).

In the case of the small-angle approximation (θ ≪ 1 rad) and assuming no clustering evolution over the redshift bin, we can de-project the angular autocorrelation function to the power spectrum by inverting Limber's equation (e.g., Myers et al. 2006; Hickox et al. 2012; see Peebles 1980 for full detivation) and further estimate the clustering scale length (r0) as follows:

Equation (6)

where ${H}_{\gamma }={\rm{\Gamma }}(1/2){\rm{\Gamma }}([\gamma -1]/2)/{\rm{\Gamma }}(\gamma /2)$, Γ is the gamma function, γ = δ + 1 (=1.8 in this work), χ is the radial co-moving distance, dN/dz is the redshift selection function, and ${E}_{z}={H}_{z}/{H}_{0}={dz}/d\chi $ (${H}_{z}^{2}={H}_{0}^{2}[{{\rm{\Omega }}}_{{\rm{m}}}{(1+z)}^{3}+{{\rm{\Omega }}}_{{\rm{\Lambda }}}]$).

According to the formalism of Peebles (1980), r0 is related to b via

Equation (7)

where ${C}_{\gamma }=72/(3-\gamma )(4-\gamma )(6-\gamma ){2}^{\gamma }$, σ8 is the amplitude of matter clustering (= 0.83; Planck Collaboration et al. 2014), and Δ8 is the clustering strength of dark matter haloes more massive than stellar mass M at redshift z, which can be defined as ${{\rm{\Delta }}}_{8}=b(M,z){\sigma }_{8}D(z)$. The function D(z) is the growth factor of linear fluctuations in the dark matter distribution, which can be computed from

In our subsequent analysis, the redshift value is assumed to be the median of the distribution of the sources. The results are summarized in Table 3.

6.3. Clustering Signals

Figure 7 shows the values of r0 as a function of redshift for our SMG candidates and for comparison samples. We also plot the measurements from the literature for 24 μm–selected galaxies (Dolley et al. 2014; Solarz et al. 2015), 100 μm–selected Herschel SMGs (Magliocchetti et al. 2013), 250 μm–selected Herschel sources (Amvrosiadis et al. 2019), 850 μm–selected SMGs (Webb et al. 2003; Blain et al. 2004; Weiß et al. 2009; Williams et al. 2011; Hickox et al. 2012; Chen et al. 2016; Wilkinson et al. 2017; An et al. 2019), and quasars (Myers et al. 2006; Porciani & Norberg 2006; Shen et al. 2007; Eftekharzadeh et al. 2015). The measured r0 (or b) of our SMG candidates and the comparison samples decline with decreasing redshift (Table 3). This trend is expected, since dark matter clustering evolves rapidly as the universe evolves with time (dotted curves in Figure 6), resulting in a decrease of the measured value of r0 (or b) of a biased population. The preceding analysis only illustrates the clustering signals of our sample, so combining the knowledge of clustering signals and halo masses of our samples is more meaningful. Our SMG candidates reside in a halo with a typical mass of ≃(2.0 ± 0.5) × 1013 h−1 M across the redshift range 0.5 < z < 3. In general, passive galaxies have stronger clustering signals than the comparison star-forming galaxies and SMG candidates, indicating that passive galaxies preferentially reside in more massive halos compared to star-forming galaxies and SMG candidates at fixed stellar mass and redshift (see also Hartley et al. 2008, 2010; McCracken et al. 2010; Lin et al. 2012, 2016; Sato et al. 2014; Ji et al. 2018). On the other hand, there is no significant difference between the SMG candidates and the comparison star-forming galaxies for the same stellar-mass cut, suggesting that these two populations reside in similar mass halos. Similar trends can also be found in earlier studies of galaxy samples at 1.5 < z < 2.5 (Béthermin et al. 2014) and at 1 < z < 5 (Chen et al. 2016; An et al. 2019). This result implies that merging events may not be the only triggering mechanism for SMGs, since we expect that more biased regions will result in higher merging and/or interaction rates (Lin et al. 2010; de Ravel et al. 2011; Sobral et al. 2011). However, further studies are still needed to address the question of which mechanisms increase the dust obscuration within galaxies.

Figure 7.

Figure 7. Redshift evolution of the clustering length r0 for our SMG candidates and the comparison samples. The data points are slightly offset horizontally, to avoid overlap. We find no evidence that the clustering signal of SMG candidates exhibits an evolution with redshift. SMG candidates reside in a typical halo mass of ≃(2.0 ± 0.5) × 1013 h−1 M across the redshift range of 0.5 < z < 3. We also show the estimated r0 of 24 μm–selected galaxies (Dolley et al. 2014; Solarz et al. 2015), SMGs (Webb et al. 2003; Blain et al. 2004; Weiß et al. 2009; Williams et al. 2011; Hickox et al. 2012; Magliocchetti et al. 2013; Chen et al. 2016; Wilkinson et al. 2017; Amvrosiadis et al. 2019; An et al. 2019), and quasars (Myers et al. 2006; Porciani & Norberg 2006; Shen et al. 2007; Eftekharzadeh et al. 2015) in literature for comparison.

Standard image High-resolution image

At z = 1–3, the clustering measured for our SMG candidates is in broad agreement with that of previous studies, except that Wilkinson et al. (2017) found a weaker clustering strength at z ≃ 1–2, although, with large uncertainties (so not significantly different from our results). We find no strong evidence that the halo masses of SMG candidates exhibit any evolution with redshift. This is in agreement with what was found in Chen et al. (2016), Amvrosiadis et al. (2019), and An et al. (2019) but is contrary to what was suggested in Wilkinson et al. (2017), in which they found that SMG activity seems to be shifting to less massive halos, consistent with an early downsizing scenario. We attribute this to their large measurement uncertainties or the different methodologies that are adopted in the clustering analyses. We note, however, that the term "downsizing" is a relative notion. While we do not observe a significant decrease of halo masses of SMGs with decreasing redshifts, which is why we say we do not observe downsizing, we are ultimately comparing halo masses at different redshifts. Given the same mass, halos at higher redshifts are expected to grow into more massive halos at the present day. From this perspective, our results support the downsizing scenario such that the SMGs at lower redshifts are formed within halos that are, on average, smaller in mass at the present day.

Our result provides a meaningful constraint on the clustering amplitude of SMGs at z ≃ 0.5–1, a key redshift range where downsizing effects are expected to take place. At z ≃ 0.5–1, the clustering signal of our SMG candidates appears to be a little higher than in the previous studies of Magliocchetti et al. (2013), Dolley et al. (2014), and Solarz et al. (2015); although, there are large uncertainties. It is worth noting that the majority of the sources in the aforementioned studies represent a fainter population with LIR ≃ 1011 L. On the other hand, our SMG candidates are bright at 450 μm (≃5 mJy; see Section 4), which corresponds to LIR ≃ 1012 L at z = 0.5–1 if we convert the 450 μm flux density into the total LIR by using the average ALESS 870 μm SEDs (da Cunha et al. 2015). Therefore, a stronger clustering signal is expected in our sample, since the galaxies with higher LIR tend to have stronger clustering (Dolley et al. 2014; Toba et al. 2017).

By comparing the z < 1 measurements and those at z > 1, along with the quasars, there is a clear trend for clustering length to increase with redshift at earlier epochs, which is consistent with downsizing behavior. The question now arises: are the SMGs at z < 1 the same dusty population as those at z > 1? The dusty star-forming galaxy population seems to have lower LIR at low redshifts (e.g., Magliocchetti et al. 2013; Dolley et al. 2014; Solarz et al. 2015), which prevents us from making a direct comparison with the SMGs at high redshifts. Nevertheless, our understanding of the clustering properties in SMGs is still far from complete. The study of the redshift evolution of the clustering properties in SMGs with comparable LIR at a similar wavelength is required to solve this problem. We expect this situation to be improved with next-generation submillimeter telescopes, such as Atacama Large Aperture Submm/mm Telescope (AtLAST; Klaassen et al. 2019). A future AtLAST survey with increased sensitivity at 350 μm, and a larger field of view (≃1°) will detect fainter populations (LIR ≃ 1011 L) at larger scales.

7. Summary

By combining SCUBA-2 data from the ongoing JCMT Large Program STUDIES and the archive in the CANDELS/COSMOS field, we have obtained an extremely deep 450 μm image (1σ = 0.56 mJy beam−1) covering ≃300 arcmin2: by far the deepest image ever observed at 450 μm. We obtain a sample of 221 SMGs from this image; however, the sample size is too small to meaningfully study the redshift evolution of the clustering of the population.

We select a robust (S/N ≥ 4) and flux-limited (≥4 mJy) sample of 164 SMGs that have K-band counterparts in the COSMOS2015 catalog identified based on radio or mid-infrared imaging, which allows us to employ their optical and near-infrared colors. Ultilizing this SMG sample and the 4705 K-band-selected non-SMGs lying within the ≤1 mJy beam−1 noise level region of the 450 μm image as a training set, we develop a machine-learning classifier to identify SMG candidates in the full COSMOS field. We employ the K-band magnitudes and color–color pairs based on the 13-broadband photometry measurements (${{uBVri}}^{+}{z}^{++}{JHK}$[3.6][4.5][5.8][8.0]) available in this field for the machine-learning algorithm. Our main findings are as follows:

  • 1.  
    Our trained classifier labels 6182 SMG candidates in the wider COSMOS field from the COSMOS2015 catalog with mK < 24.5 magAB across an effective area of 1.6 deg2.
  • 2.  
    The number density, VLA 3 GHz and/or MIPS 24 μm detection rates, redshift and stellar-mass distributions, and the stacked 450 μm flux densities of the SMG candidates across the COSMOS field agree with the measurements made in the much smaller CANDELS field, all supporting the effectiveness of the classifier. The high completeness (76% ± 7%) and precision (82% ± 7%) of our SMG candidates as judged from their detection in longer-wavelength ALMA observations further supports our machine-learning algorithm.
  • 3.  
    We found that the SMG candidates tend to have higher reddening compared to comparison star-forming and passive galaxies that are matched in redshift and stellar mass. The SMG candidates also have younger stellar population ages than the comparison star-forming and passive galaxies, even though their overall distribution of stellar masses is similar. This suggests that the SMG candidates may have more recent star formation and, consequently, have a higher proportion of young stars.
  • 4.  
    The SMG candidates have a median effective radius of ${5.5}_{-0.4}^{+0.3}$ kpc and a median Sérsic index of 1.1 ± 0.1. These measurements are consistent with previous studies within the uncertainties. Our results show that SMG candidates are significantly more extended and more disk-like than the comparison star-forming and passive galaxies.
  • 5.  
    We measured the two-point autocorrelation function of the SMG candidates from z = 3 down to z = 0.5 and found that they reside in halos with masses of ≃(2.0 ± 0.5) × 1013 h−1 M across this redshift range. However, we do not find evidence of downsizing that has been suggested by other recent observational studies.

We thank the JCMT/EAO staff for observational support and the data/survey management and the anonymous referee for comments that significantly improved the manuscript. C.C.C. acknowledges support from the Ministry of Science and Technology of Taiwan (MOST 109-2112-M-001-016-MY3). C.F.L. and C.C.C. acknowledge grant support from SSDF-(ST)18/27/E. C.F.L., W.H.W., and Y.Y.C. acknowledge grant support from the Ministry of Science and Technology of Taiwan (105-2112-M-001-029-MY3 and 108-2112-M-001-014-). I.R.S. acknowledges support from STFC (ST/P000541/1). L.C.H. acknowledges support from the National Science Foundation of China (11721303 and 11991052) and National Key R&D Program of China (2016YFA0400702). M.J.M. acknowledges the support of the National Science Centre, Poland through the SONATA BIS grant 2018/30/E/ST9/00208. M.P.K. acknowledges support from the First TEAM grant of the Foundation for Polish Science No. POIR.04.04.00-00-5D21/18-00. S.E.H. is supported by Basic Science Research Program through the National Research Foundation of Korea funded by the Ministry of Education (2018R1A6A1A06024977). Y.A. acknowledges financial support by NSFC grant 11933011. Y.G.'s research is supported by National Key Basic Research and Development Program of China (grant No. 2017YFA0402704), National Natural Science Foundation of China (grant Nos. 11861131007 and 11420101002), and Chinese Academy of Sciences Key Research Program of Frontier Sciences (grant No. QYZDJSSW-SLH008).

The James Clerk Maxwell Telescope is operated by the East Asian Observatory on behalf of the National Astronomical Observatory of Japan; the Academia Sinica Institute of Astronomy and Astrophysics; the Korea Astronomy and Space Science Institute; and the Operation, Maintenance and Upgrading Fund for Astronomical Telescopes and Facility Instruments, budgeted from the Ministry of Finance (MOF) of China and administrated by the Chinese Academy of Sciences (CAS), as well as the National Key R&D Program of China (No. 2017YFA0402700). Additional funding support is provided by the Science and Technology Facilities Council of the United Kingdom and participating universities in the United Kingdom and Canada. The submillimeter observations used in this work include the S2COSMOS program (program code M16AL002), STUDIES program (program code M16AL006), S2CLS program (program code MJLSC01) and the PI program of (Casey et al. 2013, program codes M11BH11A, M12AH11A, and M12BH21A).

Software: GALFIT (Peng et al. 2010), LE PHARE (Arnouts et al. 1999; Ilbert et al. 2006), scikit-learn (Pedregosa et al. 2011), Scipy (Jones et al. 2001), PICARD (Jenness et al. 2008), SExtractor (Bertin & Arnouts 1996), SMURF (Chapin et al. 2013).

Appendix: Clustering Results of SMG Candidates Identified by Other Algorithms

We adopt the SMG candidates that are identified by a decision tree (better recall) and random forest (better precision), and we estimate their clustering signals by using the same procedures as in Section 6. The results are shown in Table A1. The decision tree and random forest label 14440 and 1866 SMG candidates, respectively, from the COSMOS2015 catalog at redshift 0.5 < z < 3.0. The larger number of SMG candidates found using the decision tree algorithm is expected, since the higher recall will select more sources, but as a tradeoff, the precision decreases. In contrast, the situation is reserved for the random forest algorithm. The stacked fluxes of the SMG candidates identified by the decision tree and random forest methods are (3.0 ± 0.1) mJy and (6.4 ± 0.3) mJy, respectively. As expected, we find a lower stacked flux with the decision tree and a higher one with the random forest, essentially reflecting their precision. It is worth noting that the random forest algorithm only labels ∼100 SMG candidates in the redshift bin of 0.5 < z < 1.0, and consequently, we do not have sufficient data points for the clustering analyses in this redshift bin. Nevertheless, the clustering signals do not show significant differences from the results of XGBoost-identified SMG candidates (Table 3), indicating that our final results do not strongly depend upon the method we choose.

Table A1.  Clustering Results of SMG Candidates Identified by Decision Tree and Random Forest Algorithms

Sample Redshift Nsa b r0 log(Mhalo)
        (h−1 Mpc) (h−1 M)
SMG candidates (Decision Tree) $0.5\lt z\lt 1.0$ 2406 ${2.4}_{-0.2}^{+0.2}$ ${8.1}_{-0.8}^{+0.7}$ ${13.4}_{-0.1}^{+0.1}$
  1.0 < z < 2.0 8919 ${3.2}_{-0.2}^{+0.2}$ ${8.0}_{-0.5}^{+0.5}$ ${13.1}_{-0.1}^{+0.1}$
  2.0 < z < 3.0 3115 ${6.2}_{-0.5}^{+0.5}$ ${11.5}_{-1.0}^{+1.0}$ ${13.3}_{-0.1}^{+0.1}$
SMG candidates (Random Forest) 0.5 < z < 1.0 143 ... ... ...
  1.0 < z < 2.0 730 ${5.3}_{-1.3}^{+1.0}$ ${13.8}_{-3.6}^{+3.0}$ ${13.8}_{-0.4}^{+0.2}$
  2.0 < z < 3.0 993 ${6.0}_{-1.7}^{+1.3}$ ${11.0}_{-3.5}^{+2.8}$ ${13.2}_{-0.5}^{+0.3}$

Note.

aSample sizes of our samples in the corresponding redshift bins.

Download table as:  ASCIITypeset image

Please wait… references are loading.
10.3847/1538-4357/ab8eaf