Infrared Spectral Energy Distributions and Dust Masses of Sub-solar Metallicity Galaxies at z ∼ 2.3

We present results from Atacama Large Millimeter/submillimeter Array (ALMA) 1.2 mm continuum observations of a sample of 27 star-forming galaxies at z = 2.1–2.5 from the MOSFIRE Deep Evolution Field survey with metallicity and star formation rate measurements from optical emission lines. Using stacks of Spitzer, Herschel, and ALMA photometry (rest frame ∼8–400 μm), we examine the infrared (IR) spectral energy distributions (SED) of z ∼ 2.3 subsolar-metallicity (∼0.5 Z ⊙) luminous infrared galaxies (LIRGs). We find that the data agree well with an average template of higher-luminosity local low-metallicity dwarf galaxies (reduced χ 2 = 1.8). When compared with the commonly used templates for solar-metallicity local galaxies or high-redshift LIRGs and ultraluminous IR galaxies, even in the most favorable case (with reduced χ 2 = 2.8), the templates are rejected at >98% confidence. The broader and hotter IR SED of both the local dwarfs and high-redshift subsolar-metallicity galaxies may result from different grain properties or a harder/more intense ionizing radiation field that increases the dust temperature. The obscured star formation rate (SFR) indicated by the far-IR emission of the subsolar-metallicity galaxies is only ∼60% of the total SFR, considerably lower than that of the local LIRGs with ∼96%–97% obscured fractions. Due to the evolving IR SED shape, the local LIRG templates fit to mid-IR data overestimate the Rayleigh–Jeans tail measurements by a factor of 2–20. These templates underestimate IR luminosities if fit to the observed ALMA fluxes by >0.4 dex. At a given stellar mass or metallicity, dust masses at z ∼ 2.3 are an order of magnitude higher than z ∼ 0. Given the predicted molecular gas fractions, the observed z ∼ 2.3 dust-to-stellar mass ratios suggest lower dust-to-molecular gas masses than in local galaxies with similar metallicities.


Introduction
The infrared (IR) emission of dust in galaxies accounts for a significant fraction of their bolometric luminosity and encodes critical clues to how it is produced. By mass, dust only represents ∼1% of the interstellar medium (ISM) in typical galaxies. However, it reshapes galaxy spectral energy distributions (SEDs) by attenuating and absorbing UV-optical photons and reradiating that energy in the IR. The resulting IR emission accounts for approximately half of the cosmic extragalactic background (Dole et al. 2006;Finke et al. 2010), and the bulk of the cosmic star formation at z ∼ 0-3 is detected in the IR (Madau & Dickinson 2014;Planck Collaboration et al. 2014;Casey et al. 2018).
IR SEDs consist of a roughly Planckian and featureless far-IR (FIR) component plus emission features of aromatic molecules in the ∼6-20 μm range. While the mid-IR spectra (λ = 5-30 μm) are dominated by the emission from small grains that are stochastically heated by single photons, the longer wavelength FIR and submillimeter emission comes from larger grains that are in thermal equilibrium. The shape of the FIR/submillimeter SED depends on the dust composition (which determines the submillimeter spectral slope), the distribution of radiation field intensities on the dust, and the dust grain size distribution, which together determine the peak and width of the IR SED. In a comprehensive study of local galaxies, Rémy-Ruyer et al. (2015) showed that, while there are many commonalities, distinct differences exist in the IR SEDs of low-metallicity dwarfs and metal-rich star-forming local galaxies. They found that, on average, the low-metallicity galaxies have broader IR SEDs that peak at shorter wavelengths compared to those of metal-rich galaxies. Rémy-Ruyer et al. (2015) attributed the differences in the IR SEDs to a wider range of interstellar radiation field intensities (σU), with a higher average radiation field intensity (〈U〉 15 ) in low-metallicity dwarfs due to their high specific star formation rates (sSFRs; sSFR = SFR/M * ). In a sample of local galaxies with oxygen abundances of + ( ) 12 log O H ∼ 8.2-8.8, Cortese et al. (2014) also found a strong correlation between the wavelength-dependent emissivity index of the dust (β) and metallicity, with only a weaker anticorrelation with the sSFR. These results illustrate that the effects of metallicity and of the radiation field on the integrated IR SEDs of galaxies cannot be easily separated, as the two parameters themselves are interconnected. Low-metallicity systems are often dominated by young stellar populations that have relatively hard and intense radiation fields. They also have lower dust-to-gas mass ratios (e.g., Draine et al. 2007b;Rémy-Ruyer et al. 2014) that make their ISM more transparent to the stellar radiation, allowing massive star formation to impact the ISM over a larger volume. These effects make the ISM metallicity a good tracer of dust evolution processes in galaxies, as it traces both the elemental abundances in the ISM and the interstellar radiation field intensity that the dust grains are exposed to.
However, there are very few studies on the effect of metallicity on dust emission properties outside the local universe, due both to the sensitivity limitations of the available IR-submillimeter data and to the lack of robust metallicity measurements. Recent studies have shown that, at a given stellar continuum reddening (UV slope), the IR to UV luminosity ratio (IRX) of z ∼ 2 star-forming galaxies correlates with the metallicity in the range of + ( ) 12 log O H ∼ 8.3-8.6 (Shivaei et al. 2020a) or with the stellar mass (Reddy et al. 2018a;Fudamoto et al. 2020). This may indicate a change in the grain properties or dust-star geometry with metallicity. Moreover, in Shivaei et al. (2017) we showed that the mid-IR aromatic (PAH) emission relative to the total IR luminosity of z ∼ 2 galaxies decreases at + ( ) 12 log O H  8. 3-8.4, similar to the behavior seen in the local universe but at + ( ) 12 log O H 8.2 (Engelbracht et al. 2005;Draine et al. 2007b;Hunt et al. 2010;Marble et al. 2010; Li 2020, among many more). These studies suggest that the emission properties of dust at z ∼ 2 vary significantly between solar and subsolarmetallicity galaxies. The main goal of this work is to investigate the FIR emission of z ∼ 2.3 subsolar-metallicity galaxies and assess whether the broadening of the FIR SED seen locally in low-metallicity galaxies occurs at subsolarmetallicities at high redshifts. Such behavior, if confirmed, would provide valuable insight into the dust and gas properties of high-redshift galaxies. Additionally, it will have implications for the calibration of the James Webb Space Telescope (JWST), Atacama Large Millimeter/submillimeter Array (ALMA), and other IR/submillimeter measurements interpreted as the star formation rate (SFR), IR luminosity, and dust mass indicators. To reach this goal deep observations across the wavelengths from mid-IR to submillimeter are required to detect the IR emission of the lower-luminosity, and hence lower-metallicity, galaxies at high redshifts.
In this work, we present the first results of a targeted ALMA band-6 (1.2 mm) continuum survey, tracing rest-frame submillimeter emission of a sample of 27 star-forming galaxies at z = 2.1-2.5 drawn from the MOSFIRE Deep Evolution Field (MOSDEF) survey . The galaxies have robust metallicity and SFR measurements from optical emission lines (Hα, Hβ, [O III], [N II]) and span a wide range in oxygen abundance from + ( ) 12 log O H = 8.1 to 8.8. The availability of this detailed prior information enables this targeted survey of faint subsolar-metallicity galaxies with ALMA. Compared to the local galaxies with similar metallicities, the z ∼ 2.3 galaxies have higher SFRs and SFR surface densities, which make this analysis unique.
ALMA band-6 traces the Rayleigh-Jeans (RJ) tail of the dust emission at rest frame ∼360 μm for our sample. Combining the ALMA data with the shorter-wavelength Spitzer and Herschel data at rest frame 7-100 μm, we can test whether the relatively broad FIR SEDs seen in local low-metallicity galaxies are also typical at z > 1. By constraining the change in IR/submillimeter SEDs with the metallicity, we provide insight into the integrated dust properties of high-redshift galaxies (such as IR luminosity and obscured SFR), determine the masses of dust, study the dust mass fraction (dust-to-stellar mass) versus metal fraction (aka metallicity) at z ∼ 2.3, and explore its redshift evolution by incorporating various z ∼ 0 surveys from the literature. Our results extend over 0.7 dex in metallicity and down to stellar masses of ∼ 10 9.7 M e , exploring a new parameter space for galaxy evolution studies at these redshifts.
The outline of the paper is as follows. In Section 2, we present the sample and data and explain the data reduction. Section 3 presents the analysis and construction of IR SEDs. Section 4 discusses outcome of the IR SED analysis, how it changes with metallicity and star formation properties, and its physical interpretation. In Section 5, we discuss the implications for measuring the IR luminosity and SFR, as well as predicting submillimeter fluxes based on shorter-wavelength data. The evolution of dust mass fraction versus redshift and metallicity, and the comparison of dust-to-gas mass ratios with local galaxies are covered in Section 5.4. Our results are summarized in Section 6. We discuss the different models and IR templates that are used to fit the data in Appendix A. The systematic uncertainties in calculating dust masses from submillimeter fluxes are discussed in Appendix B. A cosmology with H 0 = 70 km s −1 Mpc −1 , Ω Λ = 0.7, and Ω m = 0.3, and a Chabrier (2003) initial mass function (IMF) are adopted.

Sample and Observations
Our sample is selected from the MOSDEF near-IR spectroscopic survey . MOSDEF is a survey of ∼1500 galaxies at z = 1.3-3.8 with near-IR spectra from the MOSFIRE spectrometer on Keck, primarily in the Cosmic Evolution Survey (COSMOS), Great Observatories Origins Deep Survey (GOODS)-N and All-wavelength Extended Groth Strip International Survey fields (a small fraction of the MOSDEF galaxies is located in the GOODS-S and UltraDeep Survey fields). Using the catalogs of the 3D Hubble Space Telescope (HST) survey (Skelton et al. 2014), the MOSDEF parent sample was selected based on the available photometric and spectroscopic redshifts in three redshift windows of z = 1.4-1.7, z = 2.1-2.6, and z = 3.0-3.8 down to H-band (AB) magnitudes of 24.0, 24.5, and 25.0, respectively. In these redshift ranges, most of the prominent optical emission lines are observed in the YJHK bands of MOSFIRE.

Sample Characteristics
The ALMA survey presented here (program 2019.1.01142. S; PI: I. Shivaei) targeted MOSDEF galaxies in the COSMOS field for accessibility with ALMA. There are 629 MOSDEFtargeted galaxies in this field, of which 209 are within the correct redshift range (2.0 < z < 2.5), such that the ALMA 1.2 mm and Spitzer Multiband Imaging Photometer (MIPS) 24 μm spectral bands trace the cold dust and 7.7 μm PAH emission, respectively. We further required (1) >3σ sigma detection in the Hα, Hβ, [O III], and [N II] lines to ensure robust estimates of SFR and metallicity Sanders et al. 2015;Shapley et al. 2015), which narrowed down the sample to 51; (2) unconfused MIPS 24 μm photometry (Shivaei et al. 2017); (3) no evidence of active galactic nuclei (AGN), based on X-ray emission, Infrared Array Camera (IRAC) colors, and [N II]/Hα line ratios (Coil et al. 2015;Azadi et al. 2017Azadi et al. , 2018Leung et al. 2019); and (4) rest-frame U-V and V-J colors consistent those of star-forming galaxies (Zick et al. 2018). Of the 40 galaxies remaining, we select the final sample of 27 to represent a wide range in mass and metallicity. The mass, metallicity, and sSFR distributions of galaxies in our sample are shown in Figure 1.
The MOSDEF parent sample is largely representative of typical main-sequence star-forming galaxies (e.g., Shivaei et al. 2015b). However, as the parent sample is selected based on rest-frame optical photometry, there is a bias toward galaxies with redder UV and optical colors at a given UV magnitude compared to UV-selected samples (Reddy et al. 2018b). On the other hand, the subsample that is selected for ALMA followup in this work is selected based on strong detection in multiple optical emission lines, and may exclude highly dust-obscured galaxies with simple dust-star geometries. 16 The well-measured emission lines allow use of the Balmer decrement (Hα to Hβ flux ratio) to derive corrections for reddening of the nebular lines to recover the total (intrinsic) nebular emission. The success of this approach for a Hα-Hβ detected sample is illustrated in Shivaei et al. (2016) who compare Hα SFR estimates corrected for reddening via the Balmer decrement with IR-based SFR estimates.

ALMA Data
The ALMA continuum observations in band 6 (with a representative frequency of 242 GHz) were executed in three different blocks with the 12 m array in C43-2 and C43-3 configurations. Different depths were requested for the three blocks that included subsamples with different masses and metallicities, based on the expectation of what would have been detectable. The high-mass/high-metallicity bin, M * > 10 10 M e and + > ( ) 12 log O H 8.35

PP04
(where PP04 refers to the strong line metallicity calibrations of Pettini & Pagel (2004); see Section 2.4), has 13 targets with average rms = 26.0 μJy beam −1 in the centers of the images (where the targets are located). The high-mass/low-metallicity bin,  , has 7 targets with average rms = 15.4 μJy beam −1 . The low-mass/lowmetallicity bin, M * < 10 10 M e and + < ( ) 12 log O H PP04 8.35, has 7 targets with average rms = 9.8 μJy beam −1 . 17 Due to the COVID-19 pandemic, the observations' progress on the low-mass/low-metallicity bin has been delayed, and 94% of the allocated integration time distributed over all sources in this bin was taken. However, as the available data reach close to the desired rms, we proceed with the analysis using the existing data.
Data are calibrated and imaged with the Common Astronomy Software Applications (CASA) package (McMullin et al. 2007). We create a cleaned image using the TCLEAN package with natural weighting and the multifrequency synthesis (mfs) mode, and we apply tapering, which by suppressing the weights of the outer visibilities increases the beam size (lowers the resolution). The reason for using tapering is the effect of the non-Gaussian nature of the dirty beam on flux estimation (see Czekala et al. 2021 and the appendix in Jorsater & van Moorsel 1995 for a complete discussion). In brief, in the TCLEAN process, the clean model is convolved with the clean beam and has units of Jy/ clean beam, while the dirty image and the residual map that are created after running the TCLEAN task are in units of Jy/dirty beam. If the dirty beam is a Gaussian, the two beams would have roughly the same area. However, if the dirty beam has shelves, as in the case of our data, the integrated dirty beam response would be larger than the integrated clean beam response. Therefore, in the latter case, once the residual map is added back to the convolved clean model, the fluxes will be overestimated. This effect is more pronounced for low signal-to-noise ratio (S/N) data, like ours, as the fraction of the flux in the residual map is higher. The solution can be either to convolve the clean model with a larger beam that matches the area of the dirty beam, or to scale the residuals following the prescription of Jorsater & van Moorsel (1995). The former sacrifices the resolution for correct flux and noise estimates. The latter retains the resolution; however it scales down the noise outside the regions containing emission, resulting in an incorrect noise determination (Walter et al. 2008). Given that for this analysis we are not concerned about the resolution, we adopt the first solution by using tapered images, yielding a synthesized beam with FWHM of 1 4 × 1 4. In our sample, the fluxes estimated from the untapered natural-weighted images are higher by ∼15% than the fluxes derived from the tapered (unaffected) images. This value is in agreement with the flux overestimate percentage calculated using the prescription of Jorsater & van Moorsel (1995).

Flux Measurements
Fluxes are extracted through aperture photometry, performed on the primary-beam-corrected images. We convert the image units from Jy beam -1 to Jy pixel -1 using the number of pixels per beam (calculated from the image header keywords; BMAJOR, BMINOR, CDELT1, CDELT2). Then, the total integrated aperture flux is calculated as the sum of the pixel values in a given aperture. An aperture radius of 1 45 is used to include >99% of the point-source Gaussian area (1 45 is 2.1 times the half width at half maximum of the Gaussian beam). The aperture fluxes are cross-checked for consistency with those derived from 2D fitting on tapered images through CASA. We do not attempt to derive fluxes from the peak signal, as peak fluxes are highly uncertain for low S/N objects such as the ones in our sample.
Since the targets are all at the center of the images, the nonprimary beam-corrected maps are used for error measurements Figure 2. Stacks of Spitzer/MIPS 24 μm, Herschel/PACS 100 and 160 μm, SPIRE 250 μm, and ALMA band-6 (1200 μm) images for the subsolar (top row) and solar (bottom row) metallicity bins described in Table 2. The axes show the scale of the images in arcseconds. The integrated flux densities ( f ν ) and background noise estimated errors are listed for each stacked image. Refer to Section 3.1 for details. The black circle in each image shows the FWHM of the respective PSF centered at the the subimage. 17 We initially adopted three mass-metallicity bins for sensitivity calculations; however, later in Sections 3 and 4, we simply divide the full sample into two subsolar-and solar-metallicity bins to increase the S/N in the Spitzer and Herschel stacks. We go back to three bins in Section 5.4 again, as the analysis is based on the ALMA data alone.
(e.g., as in Betti et al. 2019). The noise for the integrated flux measurements is estimated by taking the standard deviation of the integrated flux measurements in 100 apertures of the same size and offset randomly from the source position.
Given the prior knowledge on the location of the galaxies in HST/F160W images, 18 a detection is defined as S/N > 2 in the integrated aperture flux measurements. According to this criterion, 10 out of 27 objects in the sample are detected. Despite the detections for 40% of the sample, for this work we rely on image stacking for two main reasons: (1) the galaxies are not individually detected in Herschel (and in some cases in Spitzer) images, and hence stacking is required to extract the Herschel and Spitzer average fluxes, and (2) the IR SEDs of the detected objects may be biased relative to the underlying population, while stacking better represents the average behavior of the population. We start with the primary beam-corrected tapered images in Jy pixel -1 units with the sources at the center of subimages according to their optical coordinates. The optical coordinates are used for stacking, given that the targets are relatively faint and are unresolved in ALMA. The stacks are constructed by taking the average flux values in each pixel. We measure stack fluxes using aperture photometry with aperture radius of 1 45, as described in the previous section. The background and noise are measured in non-primary beam-corrected image stacks, in a similar manner as for individual galaxies.
The emission of individual objects is corrected for different redshifts prior to stacking (K-correction). Following Scoville et al. (2016;Equation (6)), we use the spectral slope of the Planck function with a temperature of 35 K and calculate the correction factor at the redshift of each galaxy relative to the average redshift of the sample. Assuming a different temperature does not change the results significantly, as the correction factors are 5% owing to the narrow redshift range of this sample.

Spitzer and Herschel Data
We use Spitzer MIPS 24 μm (Sanders et al. 2007) and Herschel Photodetector Array Camera & Spectrometer (PACS) 100 and 160 μm and Spectral and Photometric Imaging Receiver (SPIRE) 250 and 350 μm images (Lutz et al. 2011;Oliver et al. 2012) in this work to construct IR stacks, as the majority of our galaxies are not individually detected in any of these bands. Stacks are constructed following standard methods (e.g., Zheng et al. 2006;Reddy et al. 2010;Shivaei et al. 2017), as explained below. The S/N in these data sets is background-limited, and the dominant source of noise is source confusion. Our stacking approach is designed to extract accurate values using proven approaches to minimize this noise source, as described below.
All images are converted to Jy pixel -1 . For each target, a subimage with the target in the center is constructed. The avoidance of regions near bright sources can significantly reduce confusion noise (Leiton et al. 2015). Therefore, the images at 24 and 100 μm are inspected to ensure there are no bright sources near the target galaxy that could add noise even after nominal removal (no cases were found). The subimage is then cleaned of neighboring sources as follows, to provide an optimum smooth background for measuring the stacked flux. Prior lists for MIPS 24 μm sources in the field are generated based on galaxies that are detected at S/N > 3 in the IRAC channels 1 and 2. For Herschel PACS images, we use a list of priors with S/N > 3 in the MIPS 24 μm data. Magnelli et al. (2013) show that at this depth (∼20-25 μJy), virtually full identification of the PACS sources will be achieved. For the SPIRE data, we tested priors at S/N of >3 and >5, and settled on the latter value because the density of 3σ priors made the removal ambiguous, given the large beams in these bands and the probability of more than a single prior within a beam area. The neighboring sources in each subimage are removed by simultaneously fitting scaled PSFs to all the prior sources and to the target, and removing all but the target (which is usually not detected). These cleaned subimages are aligned and then used to construct 3σ trimmed average stacks. 20 Stack fluxes are derived by summing the pixel values Note. Columns left to right: 3D-HST v4 catalog ID, R.A. (deg, J2000.0), decl.
(deg, J2000.0), ALMA 1.2 mm flux density, its error, Spitzer 24 μm flux density, and its error. Galaxies below and above the horizontal line are below and above + ( ) 12 log O H = 8.60 (in the B18 calibration), respectively. The Herschel data for the majority of sources are not individually detected; therefore we refrain from listing the individual Herschel measurements. of the stack image within an aperture. Appropriate aperture corrections are calculated for the MIPS photometry based on the 24 μm PSF, and the aperture corrections for the SPIRE and PACS photometry are adopted from the Herschel Legacy Archive, 21 and Balog et al. (2014). respectively. To determine the uncertainty in the stacked fluxes, we measure the flux in 1000 apertures that are randomly positioned in the cleaned subimages away from the center of the image (where the source is) by more than 1 FWHM of the image PSF. The apertures have the same radii as the source aperture radius. The standard deviation of these 1000 fluxes is taken as the stacked flux uncertainty. As this uncertainty is evaluated on the region around the source, it should be a measure of the net uncertainty including confusion noise.
We now compare these results with standard estimates of confusion noise. A stacking approach virtually identical to ours has been simulated at 24 μm by Zheng et al. (2006). That is, they first removed all individually detected sources in the surrounding area for each target, and then stacked the cleaned images. They measured the rms background fluctuations by placing apertures at random places on the stacked images. The rms fluctuations decreased accurately in proportion to the square root of the number of stacked images. They found that 12 images gave a 5σ detection at 24 μJy; scaling to our 21 images for the low-metallicity sample predicts a 7σ result, and we found an S/N of 8 at this flux level, in excellent agreement. At the longer Herschel wavelengths, our method uses the galaxy stack positions as priors; this is important because use of priors can reduce confusion effects by factors of 2-3 (e.g., Rodighiero et al. 2006;Magnelli et al. 2013;Safarzadeh et al. 2015). For the PACS data, we take confusion noise measurements from the deep survey described by Berta et al. (2011), who quote 1σ values of 270 and 920 μJy, respectively, at 100 and 160 μm using 24 μm sources as priors. Magnelli et al. (2013) quote somewhat lower confusion noise limits, but we attribute this difference to the significantly longer integration than used by Berta or us and the resulting modest improvement in PSF fitting in crowded fields. Taking scaling as found by Zheng et al. (2006) with the square root of the number of stacked images, the noise we estimate is consistent for both the high-and low-metallicity samples (although slightly higher than the predicted confusion limit). For the SPIRE data, we start with the confusion noise values from Smith et al. (2017); their values for the nebulized beam should be relevant for our situation. We assume that these values scale inversely with the square root of the number of images stacked, as in Zheng et al. (2006). As these values were determined without priors, there should be some further reduction in our case as discussed above. To determine this gain in our case, we compare the noise predicted from confusion with that from our multiposition measurements. We find that the case of using the 5σ 24 μm priors gives a consistent comparison if we assume an improvement in depth from the use of priors of a factor of 1.4. An exception is for the high-metallicity sample at 350 μm, where the measured noise is about half of the prediction. In any case, the predicted confusion levels are large enough for this band that it has little utility in constraining fits. We therefore have discarded both the high-and low-metallicity stacked results at 350 μm.
To verify the accuracy of the aperture flux measurements on the stacked image of the target galaxies, we introduce fake sources following a similar methodology to that described in Reddy et al. (2012a). We simulate 100 PACS sources with fluxes randomly drawn from a Gaussian distribution centered at the flux values of the real stacks (the width of the distribution is irrelevant for this test). Then, simulated sources in the same number as those that went into the real stack bins are randomly selected for stacking and aperture flux measurement. Based on this input and recovered fluxes, we find that a 3σ trimmed average stack is a more accurate estimator compared to a median stack. The recovered stack fluxes are ∼10% lower than the average input. This correction factor is applied to our stack fluxes. However, the correction is negligible compared to the measurement error.
A final source of uncertainties is the absolute calibration of the various data sets, but these errors are negligible and do not affect our results. For example, the 24 μm absolute calibration of MIPS is accurate to ∼2% or better (Rieke et al. 2008). The Herschel PACS and SPIRE absolute calibrations are accurate to 4%-5% (Bendo et al. 2013;Balog et al. 2014;Müller et al. 2014;Bertincourt et al. 2016). The nominal uncertainty in the absolute calibration of the ALMA observations is 5% (Remjian et al. 2019), but it is likely to be twice as accurate (Farren et al. 2021) particularly given the good conditions of our observations.

Optical and Near-IR Data
The MOSFIRE spectral reduction and line flux extraction are fully described in Kriek et al. (2015), Reddy et al. (2015), and Freeman et al. (2019). In summary, emission line fluxes are measured from the MOSFIRE 1D spectra by fitting Gaussian functions on top of a linear continuum. The uncertainties are derived by perturbing the spectrum of each object according to its error spectrum and measuring the standard deviation of the new realizations. Slit-loss (also called path-loss) corrections are applied by normalizing the spectrum of a "slit star" placed on each MOSFIRE observing mask to match the 3D HST total photometric flux. Additionally, the HST F160W images of the resolved targets were used to estimate and correct for the additional flux lost outside of the slit aperture, relative to the slit star. For details refer to Kriek et al. (2015).

SFRs
SFRs are estimated from Hα luminosities, corrected for dust attenuation using Balmer decrements and the Cardelli et al. (1989) Milky Way (MW) extinction curve. The success of this approach to recover total SFRs for an Hα-Hβ detected sample is demonstrated in Shivaei et al. (2016), where Hα SFR estimates corrected for reddening via the Balmer decrement are compared with IR-based SFR estimates. The assumption of a MW extinction curve to correct the observed nebular emission is supported by previous studies that showed the nebular attenuation curve is similar to the MW extinction curve Rezaee et al. 2021).
Following recent studies that have discussed the metallicity dependence of the Hα luminosity to SFR conversion (e.g., Reddy et al. 2018b;Theios et al. 2019), we adopt two different conversion factors for galaxies with oxygen abundances at roughly the solar value ( + ( ) 12 log O H > 8.6 22 ) and those with lower + ( ) 12 log O H . We assume the stellar and ISM gas metallicity are linearly correlated with each other. For galaxies with + ( ) 12 log O H > 8.6, we adopt the conversion factor of Hao et al. (2011) converted to a (Chabrier 2003 For reference, the Kennicutt (1998) and Kennicutt & Evans (2012) constants to convert a ( ( )) L log H to ( ) log SFR , for the same IMF assumed here, are −41.30 and −41.26, respectively.

Metallicities
The metallicity of the gas is defined as + ( ) 12 log O H . It is common to estimate O/H using ratios of strong optical emission lines and adopting calibrations that have been constructed based on the observations of electron-temperature-sensitive lines (the so called "direct" method) in local galaxies and H II regions (such as those of Pettini & Pagel 2004). At high redshifts, due to the evolving conditions of the ionized gas, the locally calibrated relations may yield biases in the absolute values of the oxygen abundances (e.g., Kewley et al. 2013;Steidel et al. 2014;Shapley et al. 2015;Sanders et al. 2016a;Strom et al. 2018;Kashino et al. 2019). The degree of this bias will not be well known until large and representative samples of high-redshift galaxies with temperature-sensitive auroral lines are constructed to calibrate the strong emission lines for oxygen abundances. Bian et al. (2018) used a sample of local analogs of z ∼ 2 galaxies to derive empirical direct-method calibrations between + ( ) 12 log O H and strong optical line ratios. At the low-metallicity end, their calibrations for the oxygen and hydrogen line ratios agree with the median of 18 z ∼ 1.5-3.5 galaxies with direct-method metallicities from Sanders et al. (2020). Unfortunately, the [N II] line was not available for the z ∼ 1.5 − 3.5 direct-method sample in Sanders et al. (2020) to compare with the N2 In this paper, we use O3N2 line ratios mainly for the practical reason that the four lines are available with high S/N for all of our targets. O3N2 strongly correlates with O/H inferred from strong line ratios involving only oxygen (such as R 23 , which is the ratio of ([O III]λλ4959, 5007 + [O II]λλ3726, 3729) to Hβ), and with stellar mass at z ∼ 2 (e.g., Sanders et al. 2018;Strom et al. 2018). O/H inferred from O3N2 is less biased by N/O variations than when using N2, because the O3N2 spans a wider dynamic range in the line ratios. Additionally, all current calibrations for O3N2 show a nearly linear anticorrelation between the strong line ratios and O/H without any turnovers or plateaus at + ( ) 12 log O H > 8.0 (Pettini & Pagel 2004;Maiolino et al. 2008;Curti et al. 2017;Bian et al. 2018;Sanders et al. 2021). Following the methodology of Sanders et al. (2021), in the small subset (6) Bian et al. (2018) calibrations. These metallicities agree well with the O3N2-derived metallicities. For reference, the O3N2 metallicities in our sample are also tightly correlated with the N2 metallicities, but are on average ∼ 0.05-0.10 dex lower. Given the aforementioned uncertainties in estimating O/H at high redshifts, one should take caution in comparing metallicities derived using different line ratios and calibrations from different studies.
In Figure 1, we show the metallicities derived from both the Bian et al. (2018, hereafter B18) and Pettini & Pagel (2004, hereafter PP04) O3N2 calibrations. These calibrations provide a linear relationship between O3N2 and oxygen abundance, and therefore the ordering of galaxies in oxygen abundance is preserved regardless of which calibration is used. However, the choice of calibration can affect comparisons from one sample to another. As a reference point, the metallicity of + ( ) 12 log O H = 8.35 from the PP04 O3N2 calibration corresponds to + ( ) 12 log O H = 8.51 in the B18 O3N2 calibration. For the ease of comparing with other studies, we refer to the metallicities ( + ( ) 12 log O H ) from both calibrations in the rest of this paper.

Stellar Masses
Stellar masses are derived from SED fitting to the photometry of the 3D-HST survey (Skelton et al. 2014). The photometry is corrected for the nebular emission lines from the MOSFIRE spectra (details in Sanders et al. 2021). We use the FAST SED fitting code (Kriek et al. 2009) with the stellar population model library of Conroy et al. (2009) for a solar stellar metallicity, a Chabrier IMF, delayed exponentially declining star formation history, and the Calzetti et al. (2000) attenuation curve. The stellar masses are insensitive to the choice of the attenuation curve, as masses are dominated by the older stellar populations that emit in longer near-IR wavelengths where different dust attenuation curves are very similar and the total amount of attenuation (A λ ) is relatively small. 25

Analysis
In this section, we discuss our construction of the metallicity subsamples (Section 3.1), and the adopted IR templates and SED fitting procedure (Sections 3.2).

Metallicity Bins and Flux Determination
By design, this study is based on lower-luminosity, and hence fainter, galaxies than have been reached by most FIR/ submillimeter-selected samples at similar redshifts. To construct subsamples that contain enough galaxies to reliably yield a detection in the stacked images of Spitzer and Herschel data, we consider two metallicity bins (instead of the three mass-metallicity 23 To adopt the conversion factors from the literature from Salpeter to Chabrier IMFs, we multiply by a constant factor of 0.63 (Madau & Dickinson 2014). 24 See footnote 9. 25 The choice of attenuation curve in SED fitting can significantly affect the inferred SFRs, as the main difference between various attenuation curves is the curve slope in the UV, where the emission from recent star formation peaks. Studies have shown that low-mass/low-metallicity galaxies at z ∼ 2 have a steep Small Magellanic Cloud-like curve, while massive/metal-rich galaxies have a shallower Calzetti-type curve (Reddy et al. 2018a;Shivaei et al. 2020aShivaei et al. , 2020b. Therefore, in this work we do not use the UV/SED-inferred SFRs. The SFRs are derived from Hα luminosity (Section 2.4.1); hence, the choice of the stellar attenuation curve is irrelevant.
bins that were initially used in the design of the ALMA survey). The bins are selected to represent metallicities of ∼solar and subsolar with average metallicities of + ( ) 12 log O H = 8.71 and 8.41 on the B18 scale or 8.51 and 8.27 on the PP04 scale, respectively. In these two bins, our stacked images yield detections at 24 μm and 1.2 mm at high S/Ns and >2σ detections in one or more additional IR bands. We provide details of the metallicity subsamples and SED fitting in this section.
We first perform a jackknife resampling test to evaluate potential biases caused by outliers. In each metallicity subsample, we estimate stacked 24-1200 μm fluxes by systematically leaving out one object at a time and compare the stacks with those of the full sample (stacking technique described in Sections 2.2.2 and 2.3). We find that the resampled stacks in the subsolar-metallicity bin are all consistent within 1σ (σ being the uncertainty of the stack flux) with the stacks of the full subsolar-metallicity bin. The solar-metallicity bin has a smaller sample size, but except for one object, there is no systematic bias at the >1σ level for more than one photometric band in the resampled stacks. The exception is Object 8280 (Table 1), whose IR emission behavior is significantly different from the rest of the solar-metallicity sample. This galaxy shows an elevated 24-160 μm emission compared to its 250-1200 μm, indicating the presence of warm dust, which can be due to a recent starburst or buried AGN. Object 8280 has a metallicity of + ( ) 12 log O H = -+ 8.66 0.08 0.05 , and its estimated age from SED fitting is ∼50-100 Myr depending on the star formation history assumptions. Additionally, its SED inferred SFR (determined through the rest-frame UV continuum emission) is more than a factor of 2 larger than its Hα-estimated SFR. The young age and the elevated UV to Hα SFR both strengthen the argument that this galaxy has recently undergone a strong starburst. As this peculiar galaxy can bias the results of the stacks, we remove it from the rest of the stacking analysis. The properties of the final subsolar-and solar-metallicity subsamples are listed in Table 2, and their Spitzer, Hershcel, and ALMA stacks (Sections 2.2.2 and 2.3) are shown in Figure 2.

Template Fits
We consider IR templates from the literature that are commonly used. We fit the models by weighted least squares, where the weights are the inverse of the variances of the measurements. As two examples, the Rieke et al. (2009, R09) and the Schreiber et al. (2018, S18) template fits to the solarmetallicity sample are shown in Figure 3. 26 The only free parameter for each template is the normalization. Both of the R09 and S18 templates provide reasonably good fits to the measured IR SEDs for 2 < z < 4 galaxies that are sufficiently luminous to have good detections in multiple FIR/submillimeter bands (e.g., Schreiber et al. 2018;De Rossi et al. 2018). Similarly good fits are also provided by the Magdis et al. (2012) z ∼ 2 template. In agreement with previous studies on massive (and high-metallicity) galaxies at z ∼ 2 (e.g., Elbaz et al. 2011;Reddy et al. 2012b;De Rossi et al. 2018;Shivaei et al. 2018), the photometry of the solar-metallicity galaxies  i Total IR luminosity of the subsolar-and solar-metallicity bins are based on the best-fit local low-metallicity template ( Figure 4) and best-fit R09 LIRG template (Figure 3), respectively. j Obscured SFR estimated from total IR luminosity and calibrations of Kennicutt & Evans (2012). 26 The R09 templates are adopted in the luminosity range recommended by Rujopakarn et al. (2013), which selects the set of R09 models that most closely matches the shape of the IR SED of galaxies at z > 1. These models should be applicable up to a total IR luminosity of ∼4 × 10 12 L e (Shipley et al. 2016), but the lower luminosity limit of applicability is not determined. We use the S18 IR template with temperature of 35 K and PAH fraction of 3%, as suggested by Schreiber et al. (2018) for z ∼ 2 galaxies with stellar masses of ∼10 10.0−11.5 M e .
agrees well with the local luminous infrared galaxy (LIRG) templates.
Owing to the small sample size of the solarmetallicity bin (N = 5), we do not aim to draw conclusions based on the insignificant χ 2 value differences among different fits. That is, we can take the R09 sets of templates as proxies for other fits to the IR SEDs of solar-metallicity galaxies at z ∼ 2. We now will use the stack measurements from Section 2 to test the FIR behavior of subsolar-metallicity galaxies at z ∼ 2.3, and whether it exhibits the relatively broad FIR SED seen for local low-metallicity galaxies (Rémy-Ruyer et al. 2015; Lyu et al. 2016). As a starting point in the interpretation of our low-metallicity measurements, we construct a publicly available 27 IR SED template to represent the average behavior of local lowmetallicity galaxies in a quantitative way, as follows. We take the photometry for the template from the comprehensive and homogeneous results in Rémy-Ruyer et al. (2015). The FIR characteristics of low-metallicity galaxies depend on the luminosity, such that above( ( ) )  L L log IR 9.5, the average SED is significantly warmer than below this value (Rémy-Ruyer et al. 2015). For this reason, we use only the galaxies at and above this threshold for the average template. This results in 11 galaxies. Before averaging their photometry for the FIR SED, we exclude three for the following reasons. Examination of the PACS 70 μm image for the local galaxy UM 311 shows that, within the 115″ aperture used to extract photometry, the signal would have been dominated by the disk of a nearby galaxy, NGC 450 (it has even been suggested that UM 311 is an H II region in this galaxy), which artificially raises its stated luminosity. Moreover, local galaxy HS0052+2536 is much fainter than the rest of the sample, and its SPIRE measurements have too low S/Ns to be useful. Additionally, the Infrared Spectrograph (IRS) spectrum of the local galaxy Mrk 930 is too noisy and the slope of the spectrum conflicts with the photometry. We take mid-IR spectra for all nine (excluding UM 311 and Mrk 930) from Lebouteiller et al. (2011) through the NASA/IPAC Infrared Science Archive (IRSA). We average the measurements (photometry and spectra) in logarithmic space and fit the FIR photometry with polynomial series. We impose a small power-law slope (linear in log space) to make the synthetic photometry on the averaged spectrum match the photometry in IRAC Band 4, while achieving a smooth join to the fit to the FIR photometry at 20 μm. Figure 4 shows the fits to the subsolar-metallicity stacks (N = 21) to the average local low-metallicity template, as well as three of the R09 LIRG templates in the range of = ( ( ) ) - L L log IR 11.0 11.5, as recommended for high-redshift galaxies (De Rossi et al. 2018). None of the LIRG templates provide a good fit to the stacked photometry from 24 to 1200 μm, as they either underestimate the 100 μm emission or overestimate the 150-250 μm emission significantly, showing that the templates are not broad enough to represent the subsolar-metallicity stacks. We also fit other commonly used templates to the stacks to investigate their goodness of the fit parameter (reduced χ 2 ): the template of S18 with T = 35 K and 3% PAH fraction (c = 2.8 ). None of these templates provide reasonable fits to the data, and they are rejected at >98% confidence levels.
In comparison, the average local low-metallicity template better represents the behavior of the z ∼ 2.3 subsolarmetallicity stacks with a broader and warmer FIR emission (solid curve in Figure 4). It also shows a lower reduced χ 2 of 1.8, compared to the other templates mentioned above. The metallicities of the nine dwarf galaxies that are used to build the average local low-metallicity template are in the range of  11.0, 11.25, and 11.5, and the average local low-metallicity dwarf template (Section 3.2) are shown. The reduced χ 2 values (with four degrees of freedom) are also indicated. The observations follow the behavior of the average local low-metallicity template most closely, indicating a broader and warmer IR SED than predicted by the local LIRG templates. 0.1 dex higher metallicity (on the B18 calibration scale; Table 2) with half of the sample having metallicities larger than 8.4, which is the highest metallicity in the dwarf sample. However, the difference between the two strong-line metallicity calibrations adopted here for high-redshift galaxies, the PP04 and B18, introduces a systematic uncertainty of ∼0.1-0.2 dex (Section 2.4.2). In any case, the emergence of a warm dust component makes the IR SED of z ∼ 2.3 subsolarmetallicity galaxies more similar to that of the local lowmetallicity dwarfs than the local LIRGs. We discuss the physical interpretation of this result further in Section 4.2.

The IR SED Shape
In the preceding section, we found that the z ∼ 2.3 sample with subsolar metallicity is fitted better by a template based on local low-metallicity galaxies than by templates based on local galaxies of ∼ solar metallicity. We now discuss the underlying behavior of this change in the IR SED shape.

The Evolution of the Warm Dust Component
To quantify the difference in the warm dust between the subsolar-and solar-metallicity bins, we use the simple twotemperature modified blackbody (2T-MBB) model of (Kirkpatrick et al. 2015, K15) to fit the Herschel and ALMA stacks only. In brief, the model consists of two modified blackbody functions with two different temperatures, which we designate as warm and cold dust components (more details in Appendix A.2). The goal here is to investigate how the warm dust temperature and intensity (luminosity) change between the subsolar-and solar-metallicity samples. Therefore, due to the lack of sufficient observational constraints on the cold dust component, we fix the cold dust temperature and β and evaluate the change in the warm dust component temperature (T w ) and its peak flux compared to that of the cold component. The cold dust temperature is set to 25 K, and the warm component temperature is a free parameter between 30 and 150 K, motivated by the findings of Rémy-Ruyer et al. (2015), who showed that the cold dust temperature is fairly constant at T ∼ 25 K among local galaxies with different metallicities and that the warm dust component can be as high as 150 K in a number of low-metallicity galaxies (see the discussion in Appendix B). We adopt a β of 1.5 for the subsolar-metallicity samples, which is the average submillimeter emissivity index of the local dwarf galaxies (Lyu et al. 2016). For ease of comparison, we also adopt β = 1.5 for the solar-metallicity model. The assumption of β = 2 does not change any of the main results of the solar-metallicity model.
The fits are shown in the top panel of Figure 5, in which the 24 μm data are not included in the fitting procedure. For the solar-metallicity model, we fit the 2T-MBB function to the model photometry at 100, 160, 250, and 1200 μm extracted from the best-fit S18 template (Figure 3). The S18 template is also shown in gray in the solar-metallicity panel of Figure 5. For the subsolar-metallicity model, we fit the 2T-MBB function directly to the subsolar-metallicity stack fluxes at 100, 160, 250, and 1200 μm. The difference between the two metallicity samples is very clear. While the warm and cold dust components have different temperatures in both samples, the two components are easily distinguishable in the subsolarmetallicity bin. This is reflected in the temperature difference between the warm and cold components, as well as their peak flux ratios. The 2T-MBB fit to the synthesized solar-metallicity photometry matches the original IR template very well. The average of the warm (50 K) and cold (25 K) components is the same as the S18 template temperature of 35 K, and the warm component profile dominates the IR peak width with a warmto-cold peak flux ratio of 3. In comparison, the subsolarmetallicity 2T-MBB fit indicates a ΔT=108 ± 33 K temperature difference between the cold and warm components, and comparable peak fluxes. 28 Based on these fits we conclude that (a) the temperature of the warm dust component increases with decreasing metallicity, and (b) while the solar-metallicity IR SED width can be mainly represented by a single MBB, the subsolar-metallicity IR SED is broader as the difference in the temperatures of the two components is larger with similar peak fluxes (i.e., neither of the components dominates in terms of brightness). The solar-metallicity fit has ΔT = 25 ± 3 K, corresponding to l D = -+ 213 12 14 μm, while the subsolarmetallicity fit shows ΔT = 108 ± 33 K, corresponding to l D = -+ 347 16 26 μm, a factor of 1.6 wider. This behavior is similar to that found for local galaxies of subsolar and solar metallicity, e.g., Rémy-Ruyer et al. (2015), and is discussed further below.
A hotter warm component at low metallicities has also been observed both in the z ∼ 0.3 rest-frame UV analogs of z > 5 galaxies (Faisst et al. 2017), and in the local dwarf galaxies of the Dwarf Galaxy Survey (DGS; Rémy-Ruyer et al. 2015). The three low-redshift analogs of z > 5 galaxies in Faisst et al. (2017) with high sSFR and low metallicities similar to our subsolar-metallicity sample show luminosity-weighted temperatures of ∼70-90 K. Faisst et al. (2017) explained the presence of a hot dust component by possibly an optically thin ISM to UV radiation due to the low metallicities, as well as a strong UV radiation field due to high star formation densities. In Rémy-Ruyer et al. (2015), 11 DGS galaxies show an IR excess at λ rest ∼ 20-30 μm. These 11 galaxies have on average 0.25 dex lower + ( ) 12 log O H compared to the average oxygen abundance of the rest of the sample. Rémy-Ruyer et al. (2015) derived dust temperatures in the range of 100-150 K (average of 〈T w 〉 = 117 K) for the warm component, and average of 〈T c 〉 = 31 K for the cold component for those 11 galaxies, similar to the values found for the z ∼ 2.3 subsolar-metallicity sample in this work. Rémy-Ruyer et al. (2015) attributed the warm component to a higher contribution from the hot H II regions heating the dust grains in dwarf galaxies, owing to the smaller physical sizes and lower dust attenuation of the local dwarf galaxies compared to local L * galaxies. This effect can also be explained by the higher sSFR of the dwarf galaxies in their sample, producing a wider equilibrium temperature distribution of dust grains, skewed toward higher dust temperatures (hence a hotter and wider IR SED). However, the sSFR distribution of our subsolar-and solar-metallicity samples are statistically indistinguishable (referring to the sSFR average 28 We caution that the exact value of the warm dust temperature depends on the choice of β and T c . Therefore, the best-fit T w values in this section should not be taken literally and are mainly for relative comparisons. However, altering the values of the cold dust temperature and β within reasonable ranges does not affect our main conclusions. For example, a T c = 30 K for the subsolar-metallicity fit returns a T w = 150 ± 61 K and = f f 0.9 peak w peak c , and for the solar-metallicity model returns a T w = 53 ± 2 K and = f f 1.3 peak w peak c . As another example, keeping T c = 25 K but β = 2, the subsolar (solar)metallicity fit results in T w = 135 ± 10 K (49 ± 2) and = f f 0.8 peak w peak c (1.2). Therefore, the overall conclusion that the warm component is hotter in the subsolar-metallicity fit does not change by varying T c or β. Figure 5. Two-temperature modified blackbody (2T-MBB) IR fits, as described in Section A.2, to the stacks of 100, 160, 250, and 1200 μm data. First row shows the stacks of the subsolar-metallicity sample in the left panel, and the modeled solar-metallicity photometry based on the best-fit S18 template in Figure 3 in the right panel. Middle and bottom rows show the stacks in two bins of sSFR and SFR surface density (Σ SFR ), respectively. The light gray 24 μm stacked measurement is not included in the fits. The warm and cold components of the fits are also shown separately with the dotted orange and blue curves, respectively. Parameters of the warm and cold dust components are shown in the bottom of the plots: f f peak w peak c is the warm-to-cold component peak flux ratio, and T w and T c are the warm and cold dust component temperatures, respectively. The average metallicity, redshift, and other relevant properties of the samples are shown in the top-left corners. The most significant change in the width of the IR SED is between the two metallicity bins. and dispersion of the two subsamples in Table 2); therefore a change in the sSFR alone cannot explain the change in the IR SED of our two metallicity samples. Below, we explore possible causes for the elevated warm dust emission in the subsolar-metallicity sample in detail.

Possible Physical Causes for the Elevated Warm Dust Emission
Dust evolution is the combination of grain formation, processing (e.g., grain size modification, structural modification, coagulation), and destruction that can be affected by the incident nonionizing UV and ionizing radiation, cosmic rays, stellar ejecta, SNe shocks, and ISM elemental enrichment. As a result, the shape of the IR SED can be altered by both the grain properties (size and composition) and the grains' heating sources. The cold dust component resides in the diffuse ISM, is dominated by the emission from large grains, and constitutes most of the dust mass. On the other hand, the warm dust resides closer to star-forming regions or AGN. The hotter and broader IR emission seen in the subsolar-metallicity galaxies in this work may be the result of an overabundance of small dust grains (e.g., Galliano et al. 2018;Ysard et al. 2019), an intense interstellar radiation field (e.g., Dale et al. 2001;Draine & Li 2007;Galliano et al. 2011;Faisst et al. 2017), and/or an overabundance of silicate grains (De Rossi et al. 2018). These possibilities can be probed by examining the metallicity, sSFR, SFR surface density (Σ SFR ), age of galaxies, and AGN activity. As the galaxies in our sample show no evidence of obscured AGNs based on the examination of the IRAC colors (Coil et al. 2015;Azadi et al. 2017Azadi et al. , 2018Leung et al. 2019), we can assume the AGN contribution is negligible in the mid-to far-IR (for a discussion on AGN dust emission see, e.g., Kirkpatrick et al. 2012Kirkpatrick et al. , 2015Lyu & Rieke 2017. Below we review the other factors (metallicity, sSFR, Σ SFR , and age) that may be responsible for the broad IR SED shown in Section 3.
Metallicity. Metallicity is a tracer of dust processing, owing to grain growth by accretion of gas-phase metals (e.g., Hirashita 2015). However, the effect of elemental enrichment of the ISM on dust evolution becomes evident over long timescales of ∼1 Gyr (Galliano et al. 2018). Metallicity also traces the dust-to-gas ratio of a galaxy (Rémy-Ruyer et al. 2014;De Vis et al. 2019). At low metallicities, the lower dustto-gas ratio means the ISM is less dusty and more transparent, enabling the stellar radiation to heat a larger volume and deeper into the molecular cloud. Moreover, if the stellar and nebular metallicities correlate with each other (Cid Fernandes et al. 2005;Gallazzi et al. 2005;Bresolin et al. 2009;Toribio San Cipriano et al. 2017), a galaxy with a lower gas-phase metallicity may also have, on average, lower-metallicity stars that emit a harder ionizing spectrum. As a result, the harder and more intense radiation that affects larger volumes of the ISM can contribute to the hotter dust emission in low-metallicity galaxies. This effect is in addition to the lower abundances and hence lower dust attenuation that enables heating the dust deeper into molecular clouds, or the higher abundance of smaller (hotter) grains at low metallicities. 29 As demonstrated in Section 3.2, the stack photometry of subsolar-metallicity galaxies at z ∼ 2.3 follows very closely the average local low-metallicity template that we construct based on the IR photometry and spectra of higher-luminosity dwarf galaxies. This average template has a broader and warmer FIR emission compared to local solar-metallicity LIRGs. Half of the z ∼ 2.3 subsolar-metallicity sample has oxygen abundances higher than the upper limit of the dwarf sample used to build the local low-metallicity template ( + ( ) 12 log O H = 8.4), with an average of 0.1 dex higher oxygen abundance in the z ∼ 2.3 sample compared to the dwarf sample. If this difference is real, as the high-redshift oxygen abundances are subject to the assumed strong-line metallicity calibration (see discussion in Section 2.4.2), it indicates that galaxies at z ∼ 2.3 can have similar ionization field properties as local galaxies of lower metallicity. In fact, the rest-frame UV-optical studies have shown that the O/Fe ratio of z ∼ 2 galaxies is ∼0.5-0.6 dex enhanced relative to the solar abundance (Steidel et al. 2016;Topping et al. 2020bTopping et al. , 2020aCullen et al. 2021;Reddy et al. 2022). In other words, the O/H abundances of z ∼ 2 galaxies are higher than the O/H of z ∼ 0 galaxies at a given Fe/H. It is thus plausible that the Fe/H that controls the ionizing spectral production is similar between the z ∼ 0 dwarf galaxies and the z ∼ 2.3 galaxies in this study, despite their different O/H values. As a result, a similar ISM ionizing radiation field intensity and hardness causes a similar IR SED shape between the z ∼ 0 dwarfs and the z ∼ 2.3 subsolar-metallicity LIRGs.
It is also possible that the warmer IR SED of lowermetallicity galaxies originates from a geometry effect, such that the low-metallicity galaxies have a more clumpy ISM where dust is spatially concentrated and heated to higher temperatures. Tentative evidence for a clumpy dust geometry at low metallicities at z ∼ 2 has been discussed in Shivaei et al. (2020b). In that study, the authors compared the nebular and stellar dust reddening (E(B − V )) and concluded that on average at low metallicities the two reddenings are not the same, which suggests a clumpy two-component dust geometry.
We also discuss this possibility and its implications in Section 5.4.1.
sSFR. sSFR indicates the ratio of recent SFR to the SFR averaged during the lifetime of the galaxy. In studies of nearby galaxies it has been shown that the temperature of the warm dust correlates with both sSFR and metallicity (Galliano et al. 2005;Boselli et al. 2010;Smith et al. 2012;Rémy-Ruyer et al. 2013, 2015, and that at high SFRs it is driven by recently born young massive stars (Boquien et al. 2011). Rémy-Ruyer et al. (2015) attributed the warmer and wider SED of local dwarfs compared to the SED of local starbursts to differences in their sSFR. A high sSFR, indicates the galaxy is undergoing an active phase of star formation and is likely to have a clumpier ISM structure as it has a larger number of massive hot stars embedded in their dusty birthclouds (Dale et al. 2007;da Cunha et al. 2008). The clumpier ISM allows for a wider range of interstellar radiation field intensities, leading to a wider equilibrium temperature distribution of the dust grains (hence, broader IR SED), and the hotter regions shift the temperature to higher values (da Cunha et al. 2008;Rémy-Ruyer et al. 2015). sSFR also correlates with SFR surface density (Elbaz et al. 2011), which is a proxy for an intense radiation field, and hence a hotter dust temperature (see below). Moreover, a relatively higher supernovae rate in actively star-forming galaxies allows for an increase in the population of small dust grains through shattering of large grains by grain-grain collisions in supernovae shock waves (Jones et al. 1996). The sSFR distributions 29 In a turbulent ISM, grain shattering increases the abundance of small grains; however, its relative importance reduces as the metallicity decreases (Hirashita et al. 2008;Hirashita 2015). Therefore, at very low metallicities, it is expected that the initial grain size distribution is conserved. and −8.53, and standard deviation of σ = 0.21 and 0.21 for the subsolar-and solarmetallicity samples, respectively). Therefore, to assess the degree of variation in the IR SED with sSFR, we define two new bins of galaxies below and above = -( ) log sSFR yr 1 -8.45 with 15 and 12 galaxies, respectively, and fit their Herschel and ALMA data with the 2T-MBB models (middle panel of Figure 5). The difference in the warm dust temperature of the two sSFR SEDs is similar to that of the two metallicity SEDs (top panel of Figure 5). However, the SEDs of both the low-and high-sSFR stacks are still relatively broad, as the peak fluxes of the warm and cold components in both bins are similar to each other. As a comparison, the warm component in the solar-metallicity SED is 20 times more intense than the cold component, making the overall shape of the solar-metallicity IR SED narrower than that of the subsolar-metallicity one.
SFR surface density. The equilibrium temperature of dust grains can also be increased by high ISM radiation field intensity. A proxy for radiation field intensity is the SFR surface density, Σ SFR , indicating the compactness of the starforming region. The effect of Σ SFR on dust temperature is shown in previous observational studies (Lehnert & Heckman 1996;Chanial et al. 2007;Burnham et al. 2021), as well as the theoretical models of De Rossi et al. (2018). In the latter study, a "bluer" SED is produced by increasing star formation efficiency, which is accompanied by a decrease in the virial radius, and hence an increase in luminosity density. It has also been shown that the luminosity surface density explains the similar mid-IR emission behavior of centrally concentrated local ultraluminous infrared galaxies (ULIRGs) and that of high-redshift LIRGs (Elbaz et al. 2011;Rujopakarn et al. 2011Rujopakarn et al. , 2013. We use optical sizes, measured from HST/F160W images (van der Wel et al. 2014), to calculate Σ SFR . There are recent studies that show the dust emission tend to be much more compact than the stellar emission at these redshifts (Rujopakarn et al. 2019;Popping et al. 2022). However, as we use sizes to only separate the galaxies into two bins of Σ SFR , the discrepancy between the absolute optical (tracing mass) and IR (tracing star formation) sizes would likely not change any of the following main results.
The Σ SFR of galaxies in our sample (á S ( log SFR ñ = , from Tacconi et al. 2013). Within our sample, the subsolar-metallicity galaxies have on average a larger Σ SFR than the solar-metallicity galaxies. However, the correlation between Σ SFR and metallicity in the sample is weak (Pearson correlation coefficient of −0.44 with p-value of 0.02), and the standard deviation of Σ SFR within each metallicity subsample is large, making the Σ SFR distributions of the two metallicity bins not significantly distinct ( Table 2) , with 12 and 15 galaxies in the low-and high-Σ SFR bins, respectively. The fits are shown in the bottom panel of Figure 5. As expected, the best-fit model of the high-Σ SFR bin shows a warm component that is hotter than the one in the best-fit model of the low Σ SFR stacks. The warm dust temperature difference between the two bins is ΔT w = 30 ± 36, which is less significant than the warm dust temperature difference between the two metallicity bins. Equally important is the difference between the warm-to-cold peak flux ratios. Both the low-and high-Σ SFR best-fit models have warm and cold components with similar peak fluxes, indicating a broad IR SED that does not change with Σ SFR significantly.
Stellar population age. Another parameter that affects dust emission properties is age. De Rossi et al. (2018) showed that a silicate-rich mixture with amorphous carbon dust can explain a hot and broad IR SED that originates from the relatively high emission efficiency of silicates between ∼ 8 and 60 μm. A source of silicate-rich dust is massive AGB stars (with masses above 3.5 M e ; Ventura et al. 2012bVentura et al. , 2012a. A 3.5 M e star has a main-sequence lifetime of ∼400 Myr, suggesting that the silicate-rich dust composition would dominate the IR emission of galaxies younger than this age. We have age estimates for galaxies from the best-fit UV-to-near-IR SED models. However, as ages derived from exponentially rising star formation history models in SED fitting are ambiguous (e.g., see Section 6.2 of Reddy et al. 2012b), a robust analysis of the effect of age on the IR SED shape is beyond the scope of this paper. Additionally, the luminosity-weighted ages are correlated with metallicity and anticorrelated with sSFR, which makes it difficult to disentangle the age effect from the other two parameters in this analysis.
Summary. In conclusion, we find that a warm component is present across our sample,as seen in the wavelength separation of the warm and cold components' peaks in Figure 5 compared to that of the S18 model. Even in the low-sSFR and low-SFR surface density bins, the two peaks have wavelength separations of ∼265 ± 32 μm, which is wider than the wavelength separation between the cold and warm components modeled for the S18 template is -+ 213 12 14 μm. This could be due to the high sSFR and Σ SFR of these galaxies compared to the average values for their stellar mass at z ∼ 2 (Figure 1), which makes this analysis distinct from many other studies at z ∼ 2. For example, at z = 1.8 − 2.5, the S18 lowest-mass sample has = -* ( )  M M log 10.0 10.5, which is consistent with the stellar mass of our solar-metallicity sample (* ( )  M M log -10.2 10.6). The average SFR of the S18 sample based on their IR luminosity measurements and a Kennicutt & Evans (2012) relation (adopted for a Chabrier IMF) is ∼20 M e /yr. At these masses, it is expected to have >70% of the SFR in the obscured phase (Whitaker et al. 2017). Therefore, the -( ) log sSFR yr 1 of their low-mass sample is ∼−8.8 to −8.9, which is lower than the sSFR of the majority of our sample with the average and scatter of á ñ = --( ) log sSFR yr 8.4 1 and σ = 0.2 dex.
Across our sample, we find that the IR SED gets broader and the temperature of the warm component increases with decreasing metallicity, increasing sSFR, and increasing SFR surface density. The subsolar-metallicity, high-sSFR, and high- 317 27 55 μm, respectively. These are ∼ 1.5-1.6 times higher than the wavelength separation of 213 μm between the cold and warm components modeled for the best-fit S18 template. The broadening effect is the most significant with metallicity, as the two dust components show the largest peak separation and temperature difference ( l D = -+ 347 16 26 μm and ΔT = 108 ± 33 K) at subsolar metallicities.
The high-SFR surface density and subsolar-metallicity samples studied here are useful analogs to higher redshift galaxies, given the expected redshift evolution in size (Mosleh et al. 2012;van der Wel et al. 2014), SFR (Speagle et al. 2014;Tasca et al. 2015), and metallicity (Troncoso et al. 2014;Sanders et al. 2021) at a fixed stellar mass. As will be discussed in Section 5.2, the commonly used narrower and colder IR templates that are calibrated based on local LIRGs and ULIRGs, fit to shorter wavelength data alone (λ rest  60 μm), overestimate the RJ emission of high-redshift galaxies. In the case of limited available data, our results suggest that templates with hotter and broader IR SEDs should be considered for typical (i.e., LIRG and lower-luminosity) galaxies at z  2, such as the average local low-metallicity template presented in this work. Similar results have been shown previously by De Rossi et al. (2018) and Faisst et al. (2020), who recommended using the hotter and broader IR template of the local low-metallicity galaxies at z > 4.

Implications for Integrated Quantities: IR Luminosity, SFR, IR Colors, and Dust Mass
In the coming decade, the primary means to estimate IR luminosities, obscured SFRs, and dust masses will be JWST and millimeter/submillimeter facilities such as ALMA and LMT/TolTEC, operating respectively at ∼21 μm and ∼0.5 − 3 mm, which correspond to the rest-frame PAH emission and FIR/submillimeter dust continuum emission at z ∼ 2.3. Mainsequence typical galaxies similar to our subsolar-metallicity sample will be within reach for either approach and synergies between the observatories is a natural consequence. Since the results of this work indicate a change in the IR SED with metallicity, it is of concern how well either of the observed PAH or submillimeter dust continuum measurements estimate IR luminosities and how well the PAH emission can estimate the submillimeter fluxes and vice versa. The need for an improved prescription to predict the submillimeter flux of highredshift galaxies is underscored by the lower-than-expected detection rates of high-redshift galaxies in blind ALMA surveys (e.g., compare the predictions in Hatsukade et al. 2016 andFujimoto et al. (2016) with the detection rate in Aravena et al. 2016). In the following subsections, we first describe the comparison samples at z ∼ 0 and 2 (Section 5.1), and then use the results of the previous sections on the evolution of IR SED with metallicity to discuss submillimeter flux predictions and inferred IR luminosity (Section 5.2), obscured SFR (Section 5.3), and inferred dust masses (Section 5.4) of z ∼ 2.3 galaxies.
As the main results of this section and the next Section (5.4) rely primarily on ALMA photometry, we take advantage of the depth of our ALMA observations and construct stacks in more than two bins of metallicity. In this section, the uncertainties of the stacked photometry are only measurement uncertainties, and do not include the jackknife resampling as was performed in the previous sections. From a visual inspection, we exclude one of the 27 targets (ID 19753), which is likely a merger or a complex system with multiple star-forming components, from this part of the analysis. The 1.2 mm and UV peak emission of this object are not co-spatial, and while the MOSFIRE slit is centered on the component with the peak UV emission, the IR emission (traced by ALMA) is mainly from another component. Excluding this source does not change the results of the best-fit SED models in the previous sections.

Comparison Samples at z ∼ 0 and 2
Here we describe the comparison samples that are adopted from the literature. At z ∼ 0, we adopt four data sets with Herschel and Spitzer photometric observations that cover a wide range of galaxy populations in the local universe from dwarfs to ULIRGs, as follows.

The Key Insights on Nearby Galaxies: a Far-infrared
Survey with Herschel (KINGFISH; Kennicutt et al. 2011) is a survey of 61 nearby galaxies drawn from the Spitzer Infrared Nearby Galaxies Survey (SINGS), selected to span wide ranges in luminosity, optical-to-IR ratio, and morphology, with metallicities from + ( ) 12 log O H = 7.54 to 8.77. The metallicities and AGN classification are taken from Kennicutt et al. (2011). The Spitzer and Herschel photometries of the sample are listed in Dale et al. (2007) and Dale et al. (2012), respectively. To be consistent with the DGS measurements, we adopt the SFR, stellar mass, and dust mass of KINGFISH galaxies from Rémy-Ruyer et al. (2015).  Hughes et al. (2013), and O3N2 metallicities are calculated using the optical line catalog of Boselli et al. (2013). Dust masses (Ciesla et al. 2014) are derived from SED fitting. 4. To complete the sample of nearby galaxies, we also add data from Great Observatories All-sky LIRG Survey (GOALS; Armus et al. 2009), which has over 200 of the most luminous infrared-selected galaxies in the local universe. The Herschel photometry is listed in Chu et al. AGN are excluded from the KINGFISH sample ) and the HRS sample (based on optical lines; Hughes et al. 2013). The SFR and stellar masses of all the samples are either based on or converted to a Chabrier IMF.
As the main-sequence comparison sample at z ∼ 2, we adopt data from two surveys, as follows. Neither of the two samples have metallicity measurements.
1. We adopt 10 galaxies from the ALMA Spectroscopic Survey in the Hubble Ultra Deep Field (ASPECS; Walter et al. 2016) that are detected in ALMA band-6 continuum and Spitzer/MIPS 24 μm, and have redshifts of z = 1.6 − 2.2. The band-6 flux, stellar mass, SFR, and redshifts are taken from the survey's online public release 30 Boogaard et al. 2020;González-López et al. 2020). The Spitzer and Herschel fluxes are taken from the 3D-HST survey catalog (Skelton et al. 2014). Out of the 10 galaxies, only 4 are detected with PACS at 100 μm. There are no metallicity measurements for these galaxies. For consistency with our measurements, we calculate the dust masses from ALMA band-6 fluxes in the same way as done for our sample (Equation (3)) 2. The average SFR, stellar mass, and dust masses of Santini et al. (2014) at z = 2.0-2.5 are also adopted. Dust masses in this study are derived from SED fitting to the PACS and SPIRE 100-to-500 μm photometric stacks in bins of stellar mass and SFR. There are no metallicity measurements.

Estimating Submillimeter Fluxes and IR Luminosities
It is often the case at high redshifts that limited IR data are available to estimate IR luminosity or to predict flux densities of other parts of the IR SED. Therefore, for simplicity often a single locally calibrated IR template is assumed and fit to the limited data. Here, we investigate how well the IR luminosities and IR colors of our z ∼ 2.3 LIRG sample match with the commonly used IR templates of local LIRGs.
IR luminosity from ALMA. The IR luminosity (8 to 1000 μm) of our subsolar-metallicity bin is = ( ( ) )  L log L IR 11.35, calculated from the best-fit average local low-metallicity template in Figure 4. The S18 T = 35 and the R09 = ( ( ) )  L log L IR 11.25 templates fit to the ALMA submillimeter flux density alone underestimate the IR luminosity by 0.1-0.2 dex. The colder templates (e.g., the S18 T = 30 K or the R09 = ( ( ) )  L L log IR 11.00 templates) underestimate the IR luminosity by >0.4 dex. This is because these templates underestimate the elevated mid-IR emission in these galaxies. The IR luminosity of the low-metallicity z ∼ 2.3 galaxies is generally less biased when the templates are fit to PACS data, or when the broader dwarf templates are used to fit the MIPS data (e.g., see Appendix A in Shivaei et al. 2020a). The broader templates, such as the one we constructed here from the local low-metallicity galaxies, are necessary when only limited submillimeter flux observations are available.
ALMA flux from IR luminosity. From another point of view, we examine how well the submillimeter fluxes can be predicted based on the IR luminosity of our galaxies. The observed ALMA flux to L(IR) ratio of our subsolar-metallicity bin is . This observed value is lower than predictions from locally calibrated IR templates, such as those of R09 and S18. As an example, the R09 = ( ( ) )  L L log IR 11.25 and 11.50 templates have submillimeter flux-to-L(IR) ratios that are ∼0.2-0.3 dex larger than our observed value. The higher-than-observed submillimeter flux to L(IR) ratio of the templates is due to the colder dust temperature of the templates at a given IR luminosity compared to the luminosity-weighted average temperatures of the galaxies in this work. The hotter average temperatures can be attributed to the higher sSFRs of the galaxies compared to the main sequence. Similarly, the ASPECS galaxies that are located above the z ∼ 2 main-sequence relation (Shivaei et al. 2015b) show observed 1.2 mm flux-to-L(IR) ratios 31 that are lower than those predicted by the aforementioned local LIRG templates by >0.3 dex.
ALMA flux from mid-IR and PAH emission. Another common scenario for high-redshift surveys is that only photometry shortward of the IR emission peak is available (e.g., from Spitzer, Herschel, or future JWST/MIRI) to predict submillimeter fluxes. In this case, we assess how well the restframe 30-to-8 μm, 360-to-8 μm, and 360-to-30 μm IR colors of the z ∼ 2.3 galaxies match with those of the z ∼ 0 samples and local LIRG IR templates.
The z ∼ 2.3 sample in this work has relatively good constraints on the observed 24, 100, and 1200 μm photometry. These bands correspond to rest frames 8, 30, and 360 μm, respectively, which for a z ∼ 0 sample are roughly traced by IRAC 8 μm, MIPS 24 μm, and SPIRE 350 μm. In Figure 6, we show the IR colors based on the aforementioned bands for the z ∼ 2.3 sample and the z ∼ 0 comparison samples (Section 5.1). 32 Overplotted on Figure 6 are also the IR color predictions from four IR templates: (a) the R09 LIRG templates with = ( ( ) )  L log L IR 11.00, 11.25, 11.50, and 11.75 (dark red crosses with sizes increasing with increasing luminosity). These three models are in the range of recommended templates for galaxies at z > 1 by Rujopakarn et al. (2013) andDe Rossi et al. (2018), (b) the S18 templates with T = 35 and 45 K and PAH fraction of 0.01 and 0.1 (pink crosses with sizes increasing with increasing PAH fraction). The T = 35 template is the one that is recommended by Schreiber et al. (2018) to be used for high-redshift galaxies, (c) the starburst and mainsequence templates of (Elbaz et al. 2011; orange crosses with the small and large ones for the main-sequence and starburst models, respectively), and (d) the average local low-metallicity template (magenta plus sign, Section 3).
The rest-frame 30-to-8 μm colors of the z ∼ 2.3 galaxies vary by a factor of 5 between the lowest-metallicity bin ( + = ( ) 12 log O H 8.33

B18
) and the rest, which shows suppressed PAH emission at low metallicities, as expected. The 24 μm S/N of the lowest-metallicity bin is only 2.3, which makes its 30-to-8 μm color consistent with that predicted by the highest-luminosity R09 template and the hotter S18 template within 1σ. However, we know from previous studies with larger samples and better constraints on the PAH emission of z ∼ 2 galaxies, that the PAH emission at such metallicities at z ∼ 2 is suppressed compared to that of the local LIRG templates (Shivaei et al. 2017) and their IR SED shape resembles that of the local low-metallicity galaxies with low PAH fractions (Shivaei et al. 2020a). The IR colors of the lowest-metallicity stack in this work also overlap with those of some of the local dwarfs (with + ( ) 12 log O H ∼7.8), likely due to the low PAH emission and hot dust component of the 30 https://www.aspecs.info/data/ 31 Here, the L(IR) of ASPECS sample is estimated from UV-to-IR SED fitting (Boogaard et al. 2019). 32 To correct for the slight offset in rest-frame wavelength of the z ∼ 0 and 2 observations, we multiplied the z ∼ 0 MIPS 24 μm, and SPIRE 350 μm fluxes by factors of 2 and 0.9, respectively, to convert them to fluxes at rest frame 30 and 360 μm. These correction factors are estimated based on the local LIRG templates of R09. local dwarfs that resemble the dust emission characteristics of the z ∼ 2 galaxies with metallicity of ∼ 0.4 Z e .
In Figure 6, the models that are matched to the 30-to-8 μm color of the higher two metallicity stacks ( + 12 ) overpredict the rest-frame 360 μm flux by at least a factor of 2.5 (4σ) using the R09 and the T = 35 K S18 models, and even more if either of the Elbaz et al. (2011) templates are adopted. This effect can also be seen in Figure 7 where the models are fit to the observed 24 and 100 μm points alone, overpredicting the observed 1200 μm emission (except for the local low-metallicity template). The middle panel of Figure 6 indicates that the discrepancy between the observed and model predicted 350-to-30 μm colors increases with increasing sSFR, such that the higher-sSFR bin has observed colors about an order of magnitude lower than those predicted by the R09 and the T = 35 K S18 templates and that match better with the higher-temperature templates, while the lower-sSFR bin is discrepant with the two mentioned models only at a 2-3σ level. Metallicity and sSFR are highly correlated with each other, and separating their effects on the observed 360-to-30 μm color is not possible with the current data set.
The colors of the four ASPECS galaxies with detected observed 24, 100, 1200 μm data are in agreement with our two higher-metallicity stacks. The other six ASPECS galaxies are only detected in the observed 1200 and 24 μm bands and are shown by horizontal lines (i.e., no constraints on the PACS photometry)-again in agreement with the rest-frame 360-to-8 μm colors of our two higher-metallicity stacks. These results indicate that, irrespective of the PACS observations, the locally calibrated LIRG templates anchored to the rest-frame 8 μm flux density may overestimate the submillimeter flux of typical galaxies at z ∼ 2.3 by ∼1.5 up to an order of magnitude.
In summary, using the local LIRG templates and the restframe 8 μm emission alone or the combination of the 8 μm and ∼30 μm emission tends to overestimate the submillimeter fluxes of the high-redshift LIRGs. This effect becomes more pronounced with increasing sSFR. Hotter templates (e.g., those 100 μm stacks alone (blue curves) compared with the average local low-metallicity template fit all data (gray curve). The R09 and S18 templates fit the 24 and 100 μm photometry alone overestimate the observed submillimeter flux and the IR luminosity calculated from the local low-metallicity fit the 24-to-1200 μm data. Figure 6. IR colors from rest-frame 360, 30, and 8 μm emission for the z ∼ 2.3 stacks in this work (large stars) compared to those of z ∼ 0 surveys and the z ∼ 2 ASPECS galaxies. Comparison samples are described in Section 5.1. Crosses show model predictions: The magenta plus sign is the average local low-metallicity template, and the small and large orange crosses (connected with a dotted line) show the main-sequence and starburst models of Elbaz et al. (2011), respectively. Dark red crosses (connected with a solid line) show the R09 models with IR luminosities of 10 11 , 10 11.25 , 10 11.5 , and 10 11.75 L e , in order of increasing size. Pink crosses (connected with dotted-dashed lines) show the S18 models for two temperatures (indicated on the plots) and two PAH fractions (0.01 and 0.1 for the small and large cross, respectively). Top: 360-to-30 μm vs. 30-to-8 μm flux ratios, color coded by metallicity. The stars with black edges are stacks in three bins of metallicity, while the gray edge stars are two stacks in bins of sSFR, shown in the middle panel. Middle: 360-to-30 μm vs. 30-to-8 μm flux ratios, color coded by sSFR. Stars with black edges are the stacks in two bins of sSFR. The stacks in three metallicity bins from the top panel are shown by stars with gray edges. Bottom: 360-to-8 μm vs. 30-to-8 μm flux ratios, color coded by metallicity. Horizontal black lines show the 360-to-8 μm color of the ASPECS z ∼ 2 galaxies that do not have PACS 100 μm (rest frame 30 μm) detections. with T > 40 K from the S18 library or the R09 templates with > ( ( )  log L IR L 11.75 R09) predict more accurate submillimeter fluxes based on shorter wavelength data. Similar conclusion holds for the local low-metallicity template derived in this work (Section 3), where the submillimeter fluxes of z ∼ 2.3 LIRGs are well predicted based on 8 or 30 μm emission. However, all templates predict lower 30-to-8 μm colors than the lowest-metallicity z ∼ 2.3 galaxies ( + ( ) 12 log O H ∼ 8.3). The IR colors of these low-metallicity z ∼ 2.3 galaxies are similar to those of some of the lowestmetallicity local dwarfs albeit their 0.2-0.6 dex higher oxygen abundances, suggesting very weak PAH emission at gas metallicities of + ( ) 12 log O H ∼ 8.3 at z ∼ 2.3 (similar to the findings of Shivaei et al. 2017).

Obscured SFR Fraction
We estimate the total SFR of the sample from dust-corrected Hα luminosity using the Balmer decrement (Section 2.4.1; Table 2). As it has been shown elsewhere (Shivaei et al. 2016), this procedure is a good estimator of the total SFR in galaxies that are not heavily dust-obscured, such as those in this study (see sample characteristics in Section 2.1). Comparing the total SFR with the obscured SFR derived from L(IR) (using the calibrations of Kennicutt & Evans 2012) demonstrates that there is a significant fraction of unobscured star formation that does not contribute to the IR emission in both metallicity bins. The ratios of SFR(IR) to dust-corrected SFR(Hα) for the subsolar-and solar-metallicity stacks in this work are 59% and 57%, respectively. Previous studies have shown that the obscured fraction of star formation (the ratio of SFR(IR) to bolometric SFR) decreases with decreasing mass (e.g., Reddy et al. 2010;Whitaker et al. 2017); however the ∼60% obscured fraction of the our sample, with average stellar mass of = * ( )  M M log 10.18 and 10.43 in the subsolar-and solarmetallicity bins, is lower than that found previously. For example, Whitaker et al. (2017) predict an obscured fraction of 86 ± 2% at = * ( )  M M log 10.18 from their average fit to the data. However, the tail of their obscured fraction distribution at these masses extend to ∼50%. The discrepancy with the average prediction may be due to a bias toward less obscured star-forming galaxies in our sample, as the sample is selected to have significant detection in optical nebular emission lines. However, the Whitaker et al. (2017) sample is also selected based on rest-frame optical continuum emission, and presumably biased against heavily obscured systems. The discrepancy may also originate from the way IR luminosity is estimated in Whitaker et al. (2017). In that work, the authors convert 24 μm fluxes to IR luminosity from a single log average of the Dale & Helou (2002) templates. The use of a single template over all luminosities is an oversimplification. For example, using a single IR template of R09 (e.g., the = ( ( ) )  L log L IR 11.25 or 11.50 template) to convert the 24 μm flux to IR luminosity, overestimates the IR luminosity by a factor of 2 because the FIR/submillimeter emission predicted by these templates is higher than the observations (Figure 7). If the unobscured SFR is about half of the obscured SFR (i.e., the obscured fraction is 0.67), then a factor of 2 overestimation in the obscured SFR results in an obscured fraction of ∼0.8, which is similar to the discrepancy we see between the low-metallicity obscured fraction and that predicted by Whitaker et al. (2017). These results indicate that the unobscured SFR may be more significant in the subsolar-metallicity and/or high-sSFR galaxies at z ∼ 2.3 than has been previously assumed.

Dust Mass
The bulk of dust mass (M dust ) in galaxies is from the cold dust population that dominantly emit at FIR/submillimeter wavelengths, making the RJ emission a good diagnostic for dust masses. Following the discussion in Scoville et al. (2016), we derive dust masses from the ALMA 1.2 mm flux densities, assuming an optically thin MBB, as where S ν is the 1.2 mm flux density (in the observed frame), f CMB is the correction factor for the cosmic microwave background (CMB) effect on the background at the redshift of the targets (Equation (18)  , where κ 0 is the opacity at ν 0 and β is the submillimeter emissivity index. The main assumptions that enter this calculation and contribute to the uncertainties on dust masses are the temperature of the cold component and the submillimeter emissivity and opacity parameters. For the cold dust temperature, we perform a Monte Carlo simulation by drawing dust temperatures from a Gaussian distribution with a mean of 25 K and σ = 5 K. We assume an emissivity index of β = 1.5, as our subsolar-metallicity data suggest a β shallower than 2 (wider IR peak), in agreement with previous studies (see Appendix B). Based on the choice of β, an opacity of κ 0 = 0.232 m 2 kg −1 at 250 μm is adopted (Draine 2003;Bianchi 2013). The κ 0 value is the most uncertain factor in dust mass estimations. Different assumptions of κ 0 from the literature affect dust mass estimations by up to a factor of 3. In Appendix B, we discuss our assumptions on dust temperature, β, and κ 0 and their systematic uncertainties in detail. Figure 8 shows the individual dust mass measurements and the stacked values in bins of stellar mass and metallicity of the galaxies in this work, compared with other samples at z ∼ 0 to 2 (for the description of the comparison samples refer to Section 5.1). The z ∼ 2 samples show consistent dust masses where they overlap in stellar mass. At a given stellar mass, dust mass increases from z ∼ 0 to 2 by about a factor of 10. An increase in dust mass at a given stellar mass has been previously seen at higher stellar masses (Santini et al. 2014;Kirkpatrick et al. 2017), and is expected owing to the increase in gas-to-stellar mass ratio from z ∼ 0 to 3 (Schinnerer et al. 2016;Tacconi et al. 2018;Decarli et al. 2020). We also see higher dust masses at a given metallicity at z ∼ 2.3 compared to z ∼ 0 in the right panel of Figure 8. If the PP04 metallicity scale is used instead of the B18 scale for the z ∼ 2 sample, the 33 At the dust temperatures and redshifts of the galaxies under study the additional dust heating by the CMB is negligible, but due to the effect of the CMB background at the observed wavelength, the observed flux against the CMB is ∼80% of the intrinsic flux (see Figure 3 in da Cunha et al. 2013). redshift evolution of dust mass at a fixed O/H would be even larger. Therefore, the systematic uncertainties associated with the metallicity estimates at high redshifts are unlikely to artificially induce the observed trend in dust mass evolution.

Dust Mass Evolution
To better understand the redshift evolution of dust mass fractions, we plot dust-to-stellar mass ratio (D/M * ) and dust mass to SFR ratio (D/SFR) as a function of the metallicity in Figure 9. As also seen in the z ∼ 0 sample (Rémy-Ruyer et al. 2015) and the simulations (Popping et al. 2017), there is no statistically strong correlation between the D/M * and metallicity at z ∼ 2. However, there is about an order of magnitude increase in D/M * from z ∼ 0 to 2 across all metallicities that are covered in this work. In Section 5.4.2, we estimate the gas masses from star formation law scaling relations to compare with our dust mass measurements and discuss the possible explanation and implications of the dust mass evolution.
The galaxy formation semianalytical models of Popping et al. (2017) at z = 0 and 2 are shown in Figure 8. These models predict less dust mass at z ∼ 2 compared to the observations, particularly at lower stellar mass and metallicities. This discrepancy originates from an underproduction of molecular gas masses of the z ∼ 2 main-sequence galaxies in the models compared to observations (Somerville & Davé 2015;Popping et al. 2019). Given that in the Popping et al. (2017) model the dust growth mechanism at these metallicities is dominated by the accretion of metals onto grains in the dense ISM, an underestimation of molecular gas mass compared to the observations also results in a lower than observed dust masses. The discrepancy is most pronounced at lower metallicities in the right panel of Figure 8, which stems from a known outcome of the model that dust-to-metal ratios are lower than the observations at + ( ) 12 log O H  8.5 (G. Popping & C. Péroux 2022, in preparation).
In Figure 9, the Popping et al. (2017) semianalytical models are in good agreement with D/SFR-metallicity relation at z ∼ 2, even though they underestimate dust masses at lower metallicities in Figure 8. In these models, the main mechanism of dust formation at these metallicities is through dust growth in the ISM, which is a strong function of molecular hydrogen surface density and metallicity. On the other hand, SFR is also regulated by H 2 mass through empirical relations. Therefore, although the H 2 gas mass predicted by these models is less than that observed at z ∼ 2, the ratio of dust mass to SFR stays in agreement with observations. In other words, in the models, dust-to-molecular-gas mass and SFR-to-molecular-gas mass relations with the metallicity are in agreement with observations; yet the dust mass and SFR are lower than those observed at these metallicities.
An increase in dust masses from z = 0 to 2 at a given stellar mass is in tension with the apparent lack of redshift evolution in the obscured SFR fraction and IR-to-UV-luminosity ratio (IRX) or UV obscuration (A 1600 ) at a fixed stellar mass over the same redshift range (Bouwens et al. 2016;Whitaker et al. 2017;Reddy et al. 2018a;Shapley et al. 2022, but Shivaei et al. 2020a found an evolution in IRX at a given UV slope with the mass and metallicity; see below). One possible explanation is a change in the dust-star geometry of main-sequence galaxies at z ∼ 2 compared to the local galaxies. The bulk of dust mass is determined by the cold dust population, while the IR emission is dominated by the emission from warmer dust grains in the actively star-forming regions. A two disjoint dust component geometry, where one component is the hot dust in the birth clouds of recently formed stars and the other is the colder dust that resides in the diffuse ISM (Charlot & Fall 2000), would reconcile the high D/M * of z ∼ 2 galaxies with the lack of evolution in their obscured SFR fraction (and IRX) versus stellar mass relation. There is observational evidence for the emergence of such a two-component dust geometry in subsolar-metallicity galaxies at z ∼ 2 from a comparison of the dust reddening of ionized nebular gas and that of stellar continuum (Figure 8 in Shivaei et al. 2020b). A higher average nebular dust reddening compared to the stellar reddening for subsolar-metallicity z ∼ 2 galaxies suggests two different dust populations affect the emission from massive young stars and older stars. Spatially resolved observations (on the scales of a few tens of pc) are required to verify this physical picture. Different dust characteristics (composition and size) with different radiation efficiencies may also play a role. For  Table 3). Samples from the left panel that do not have metallicity measurements are not shown in the right panel. High-redshift galaxies are on the metallicity scale of Bian et al. (2018). Semianalytical models of Popping et al. (2017) at z = 0 and 2 are shown in both panels. At a given stellar mass or metallicity, the z ∼ 2.3 data show about an order of magnitude higher dust mass compared to z ∼ 0. example, Shivaei et al. (2020a) showed that at a given UV continuum slope β (i.e., stellar continuum reddening, not to be confused with the β that characterizes submillimeter dust emissivity), the lower-metallicity galaxies have lower IRX compared to the higher-metallicity ones, which may indicate different dust characteristics in low-metallicity galaxies. Other studies have shown similar results that younger and lower-mass galaxies at z ∼ 2 have a lower IRX than high-mass galaxies at a fixed UV slope (e.g., Reddy et al. 2012aReddy et al. , 2018aFudamoto et al. 2020). In this picture, when comparing the z ∼ 0 and 2 galaxies at a given stellar mass, the z ∼ 2 galaxies have lower metallicities, and although they have higher dust masses, dominated by large grains emitting at FIR/submillimeter wavelengths, their light-weighted IR luminosity is not proportionally higher as it is dominated by the reemitted light from the smaller grains whose characteristics have changed. Another important factor that might have contributed to the lack of a redshift evolution in the IRX (or obscured SFR)stellar mass relations in the literature is the uncertainties in measuring L(IR) and the obscured SFR of low-mass/lowmetallicity galaxies at z ∼ 2. For example, in the presence of a significant warm dust component, a cold-dust IR template tends to underestimate the IR luminosity (as discussed in Section 5). On the other hand, obscured SFRs derived from PAH emission could be underestimated if the reduced intensity of PAH emission relative to L(IR) at low metallicities is not considered (Figure 7 in Shivaei et al. 2017).

Gas Mass Estimates
Gas masses from star formation law scaling relations. We first estimate the gas masses from the scaling relations of Tacconi et al. (2018), independent of our dust mass measurements (Table 3). Tacconi et al. (2018) used a combination of CO observations, dust SEDs, and MBB fits to the RJ tail to estimate molecular gas masses. The gas masses of galaxies with M * < 10 10.5 M e at z > 1 in that work are dominantly from either of the latter two methods (no CO observations). For those galaxies, the molecular gas mass is derived from the dust mass, assuming a linear relation between dust to molecular gas ratio and metallicity, 34 where metallicities are estimated from a mass-metallicity scaling relation. The Tacconi et al. (2018) scaling relations at higher metallicities (higher stellar masses) are based on direct CO observations and not dust mass estimates from IR/submillimeter data. These estimates are subject to uncertainties in CO-to-H 2 conversions (Bolatto et al. Figure 9. Top: dust-to-stellar mass ratio as a function of metallicity for individual galaxies in this work and their stacked values in three bins of metallicity. Symbols are the same as in Figure 8. AGN are excluded from the KINGFISH sample ) and the HRS sample . Bottom: dust mass to SFR as a function of metallicity for the same objects as in the top panel. Symbols are color coded by the ratio of their SFR to the main-sequence SFR at the given mass. For the z ∼ 2.3 sample, the main-sequence relation of Shivaei et al. (2015b) is adopted (modified for the same IMF as used in this work) and for the z ∼ 0 objects, the main-sequence relation of Salim et al. (2007) is adopted. Semianalytical models of Popping et al. (2017) at z = 0 and 2 are shown in both panels. 2013). A more detailed discussion on the CO conversion uncertainties is beyond the scope of this work and will be addressed in future work where direct CO observations are available.
D/M * is proportional to dust-to-gas mass ratio (D/G) multiplied by gas-to-stellar mass ratio (G/M * ). Using the scaling relations of Tacconi et al. (2018) between the molecular gas mass, M * , and SFR over a redshift range of z = 0-4, we estimate the molecular gas masses of our galaxies based on their redshift, offset from the main sequence, 35 and stellar mass. For consistency, we also calculate molecular gas masses in the same manner for a subset of the DGS, KINGFISH, and HRS samples that have similar average metallicities as our three metallicity bins ( + ( ) 12 log O H ∼ 8. 3, 8.5, and 8.7, for the DGS, KINGFISH, and HRS subsamples, respectively).
Assuming no redshift evolution in D/G at a given metallicity between z ∼ 2.3 to 0 (shown at solar metallicities in Shapley et al. 2020 andG. Popping et al. 2022, in preparation) and using the estimated molecular G/M * ratios from the Tacconi et al. (2018) relations as described, D/M * is expected to be ∼ 20-50 times higher in the z ∼ 2.3 sample compared to the z ∼ 0 comparison samples. However, our observations show a factor of ∼ 2-10 smaller change in D/M * at a given metallicity. This implies that dust to molecular gas mass is lower at z ∼ 2.3 compared to that at z ∼ 0 by a factor of ∼2-10.
A drastic change in D/G is anticipated for galaxies with different star formation efficiencies (SFE; SFR per unit molecular gas mass), when the metallicities are close to a "critical" metallicity defined by Asano et al. (2013). Critical metallicity is the metallicity at which dust mass growth in the ISM becomes equal to dust production from stars (AGB stars and SNe). According to the theoretical dust evolution models of Asano et al. (2013), the critical metallicity itself depends on SFE, and becomes larger with shorter star formation timescales (τ SF , defined as gas mass to SFR). Based on their simulations, at + ( ) 12 log O H = 8.4, the DtG of a model with τ SF = 0.5 Gyr is 13 times lower than that of a model with τ SF = 5 Gyr. In the bottom panel of Figure 9, the clear trend of decreasing D/SFR at a given metallicity with increasing SFR offset from the main sequence (SFR/SFR MS ; color-coding in Figure 9, bottom panel) is a result of increasing SFE-both with increasing offset from the main sequence and with increasing redshift. In summary, the higher D/M * at z ∼ 2.3 compared to z ∼ 0 at a given metallicity in Figure 9 can be explained by the higher molecular gas fractions along with the lower D/G (due to the higher SFEs) of the z ∼ 2.3 galaxies.
As a comparison, Table 3 also shows gas masses derived from the Schmidt-Kennicutt star formation law adopted from Kennicutt & De Los Reyes (2021). The single power-law relation in Kennicutt & De Los Reyes (2021; with slope n=1.5) represents the overall relation for the nearby starbursts and nonstarbursting disk galaxies. However, the resolved Schmidt-Kennicutt relation has not been tested for main-sequence galaxies at higher redshifts. There are resolved CO and dust observations of z ∼ 1-3 galaxies that indicate a different spatial distribution of molecular gas and dust compared to the stellar emission (e.g., Calistro Rivera et al. 2018;Kaasinen et al. 2020). These observations are typically limited to more massive and/or more actively star-forming galaxies than those in this work. At last, whether the F160W (rest-frame optical) sizes that we adopt here are representative of where recent star formation activity occurs, is another source of uncertainty in the gas masses derived from the Schmidt-Kennicutt relation in Table 3. Resolved CO and dust observations of larger samples of main-sequence galaxies at z > 1 would help to shed light on these uncertainties.
Gas masses from D/G-metallicity relations. Using the metallicity of our galaxies, we also calculate the expected D/ G ratios based on the D/G-metallicity relations of Leroy et al. (2011), Rémy-Ruyer et al. (2014), and De Vis et al. (2019 in Table 3. The Leroy et al. (2011) relation between molecular gas and dust mass is derived based on the resolved CO, H I, and IR maps of five nearby galaxies that span an order of magnitude in metallicity from  . The errors are propagated measurement errors of Σ SFR and radius and the uncertainty on the star formation law coefficient (1.5 ± 0.05), as reported in Kennicutt & De Los Reyes (2021). based on 466 local galaxies from the DustPedia project 36 with metallicities from optical lines, H I observations (H 2 is calculated from a scaling relation between the H 2 -to-H I ratio and the H I-to-stellar mass ratio), and IR data.
In Table 3 (2014). Such discrepancies underscores the importance of CO observations of subsolarmetallicity galaxies at z > 1 to robustly constrain their dust to molecular gas ratios, even though the estimates will be subjected to the CO-to-H 2 conversion uncertainties (Bolatto et al. 2013).

D/M * versus sSFR (Dust Formation Rate Diagram)
Several studies have attempted to model D/M * evolution in a galaxy (e.g., Asano et al. 2013;Nanni et al. 2020) and its evolution with the redshift (e.g., Calura et al. 2017;Popping et al. 2017). To track the time evolution of dust within a galaxy, we look at the D/M * versus sSFR diagram (or dust formation rate diagram), as shown in Figure 10. We show dust evolutionary tracks from two different models, those of Asano et al. (2013) and Nanni et al. (2020). The models assume various dust production (stellar origin and dust mass growth in the ISM) and destruction (SN shocks) channels. In the Asano et al. (2013) models, dust production at low metallicity and high sSFR is dominated by stellar sources. Once the galaxy reaches its critical metallicity in the ISM, dust mass growth by metal accretion on the dust grains in the ISM dominates the dust production. At that point, dust mass increases more rapidly than the rate of star formation (hence the steep rise in the DtS). After that, when the majority of metals is locked up in grains, dust mass growth saturates but the stellar mass build up continues, resulting in a decrease in sSFR and DtS. Figure 9 shows that our data, with dust and metallicity measurements, occupies a distinct parameter space with higher sSFR and higher DtS compared to the majority of the z ∼ 0 population. The z ∼ 2 data of Santini et al. (2014) and the ASPECS program overlap with our measurements (but without metallicity constraints). One can argue that an Asano et al. (2013) model with a lower depletion time than that assumed in their work (τ = 500 Myr shown by the blue solid curve is the model with the shortest timescale in that work) may cover the z ∼ 2 parameter space. However, typical depletion timescales for these galaxies are not expected to be much shorter than a couple of 100 Myr (e.g., Tacconi et al. 2018;Boogaard et al. 2019).
The z ∼ 2.3 data can also be explained by the Nanni et al. (2020) models that have outflows and a higher SN condensation efficiency compared to the Asano et al. (2013) models. Nanni et al. (2020) assumed a top-heavy IMF as their fiducial model. A typical IMF (e.g., a Chabrier IMF) would shift the curves downward in this diagram. Dissecting the exact theoretical dust evolutionary model that best describes our data set is beyond the scope of this paper. However, it is clear that given the large number of uncertain parameters in the theoretical models, the combination of dust mass, metallicity, and sSFR at z > 0 opens a new parameter space to constrain the theoretical models with observational evidence.

Summary
In this work, we present band-6 (1.2 mm) ALMA continuum observations of a sample of 27 star-forming galaxies at z = 2.1-2.5, with robust metallicity and SFR measurements from rest-frame optical emission lines (Hβ, [O III]λ5008, Hα, [N II]λ6585), carefully selected to represent a wide range in metallicity, + ( ) 12 log O H B18 ∼ 8.1-8.8, and stellar mass, -* ( )  M M log 9.7 10.6. The galaxies are selected from the rest-frame optical spectroscopic MOSDEF survey and are located in the COSMOS field with a wealth of photometric data covering rest-frame UV to FIR.
Using the Spitzer, Herschel, and ALMA data, we constrain the IR SED and dust masses of the z ∼ 2.3 subsolar-metallicity galaxies and compare them with those of local galaxies spanning a wide range in galaxy populations from lowmetallicity dwarfs to starbursts. The low metallicity and high sSFR of the z ∼ 2.3 galaxies in this work make them ideal analogs for typical star-forming galaxies in the early universe. The main conclusions of the paper are as follows.
We find that the average IR SED of the z ∼ 2.3 subsolarmetallicity galaxies (á + ñ= ( ) 12 log O H 8.4

B18
) has an additional warm dust component (with peak temperature of about 100 K) that broadens the IR SED, a behavior similar to that of the local dwarf galaxies but different from the commonly adopted LIRG and ULIRG templates (Section 3 and Figure 4). The width of the IR SED of the z ∼ 2.3 solar-metallicity galaxies is similar to that of the local LIRGs, as expected (Section 3 and Figure 3). The IR SED also becomes broader and warmer with increasing sSFR, and increasing SFR surface density (Section 4.2 and Figure 5). However, the broadening effect is the most significant when the sample is divided by gas metallicity, such that the ∼ 30−80 μm IR SED changes from a dominant single warm MBB component with T ∼ 50 K at solar metallicity to two distinct components with T ∼ 25 and 130 K at subsolar metallicity (Section 4.1 and Figure 5).
The warm and broad IR SED of the z ∼ 2.3 subsolarmetallicity galaxies can be attributed to a more transparent ISM as well as a harder ionizing radiation that increase the dust temperature to higher values and in larger volumes, similar to that of the local dwarf galaxies with ∼0.1-0.6 dex lower oxygen abundances (O/H). It is also partly due to the higher star formation surface density and sSFR of the z ∼ 2.3 galaxies compared to the z ∼ 0 ones at a given stellar mass, which result in a wider range of interstellar radiation field intensities that dust grains are exposed to, and a higher relative abundance of hot grains in the vicinity of H II regions, emitting at shorter wavelengths. A more clumpy dust geometry of the lowmetallicity galaxies may also contribute to the broadening effect (Section 4.2).
The presence of a significant warm dust component in the subsolar-metallicity and high-sSFR galaxies has important implications for deriving IR luminosity and obscured SFR of high-redshift galaxies, as well as for predicting submillimeter fluxes with limited optical to mid-IR data. We show that the locally calibrated LIRG templates overestimate submillimeter ALMA fluxes of z ∼ 2.3 LIRGs by a factor of ∼1.5 up to an order of magnitude, if anchored to the PAH emission and/or mid-IR continuum emission (rest frame ∼30 μm; Figure 6 and 7). The overestimation increases with increasing sSFR. Adopting a broader SED with a prominent warm dust component, similar to those of local dwarf galaxies, alleviate the overprediction of the submillimeter fluxes. On the other hand, the IR luminosity can be underestimated by >0.4 dex if the cold LIRG templates are used and scaled to a single submillimeter ALMA continuum flux. This is often the case at z > 3, where only a single FIR data point is available to estimate IR luminosities. As such galaxies have relatively lower metallicities and are often selected to have vigorous star formations, we recommend to adopt warmer and broader templates to mitigate the underestimation of IR luminosity (Section 5.2). The average local low-metallicity template constructed in this work (Section 3.2) is publicly available. 37 Using the best-fit IR models with the ALMA constraint, we estimate the total L(IR) and obscured SFR for the sample. Comparing the obscured SFR to total SFR derived from optical emission lines, we find that the obscured SFR of the subsolarmetallicity galaxies with average stellar mass of = ( )  log M M 10.18 is ∼ 60% of the total SFR. This is 1.5 times lower than the average value indicated in previous studies for the general population of galaxies at this stellar mass (Section 5.3).
We find that dust masses (derived from ALMA 1.2 mm data) are about an order of magnitude higher at z ∼ 2.3 compared to z ∼ 0 at a given stellar mass and metallicity. An order of magnitude higher dust-to-stellar mass implies an order of magnitude higher gas-to-stellar mass fractions at z ∼ 2.3 compared to z ∼ 0, if dust-to-gas mass ratio stays constant. However, CO-based studies (Tacconi et al. 2018) predict a larger dust-to-stellar mass evolution for the star-forming galaxies in this work, which suggests that dust in these galaxies is depleted (i.e., lower dust to molecular gas mass ratios) by a factor of >2. A lower dust-to-gas ratio can be a result of a high star formation efficiency in the galaxies in this sample (Section 5.4).
The higher dust-to-stellar mass ratios of z ∼ 2.3 galaxies compared to z ∼ 0 is potentially in tension with the apparent lack of redshift evolution in the obscured luminosity (or SFR) fraction at a given stellar mass. Given that the bulk of dust mass is from the cold dust population, while the IR emission is dominated by the emission from warmer dust in the vicinity of actively star-forming regions, this finding hints at different dust-star geometry in z ∼ 2.3 main-sequence galaxies compared to that of the local galaxies (Section 5.4).
Many of the existing empirical IR SED templates and gas mass calibrations at z > 1 are based on high-mass, highmetallicity galaxies. This study pushes into the important territory of lower metallicities and finds key differences in the dust properties of subsolar-metallicity galaxies at z ∼ 2.3 compared to their more metal-rich counterparts. These differences are crucial to be considered for future observations with IR facilities and motivate further detailed studies of gas and dust in the lower-metallicity regime at high redshifts. For instance, better sampling of the FIR/submillimeter emission of galaxies with ALMA and LMT/TolTEC, over a larger wavelength range, would help to set observational constraints on the cold dust temperature and dust emissivity index of subsolar-metallicity galaxies, which are not well constrained by the single ALMA band observations presented in this work. Moreover, the relative contribution of PAHs and very small grains to the mid-IR spectra remains unconstrained with only a single photometric data point from Spitzer/MIPS 24 μm. Future studies with JWST/MIRI will be able to break this A.3. The L16 Low-metallicity Templates To complete the set of IR templates, we also use the IR SED library of 19 local low-metallicity galaxies from (Lyu et al. 2016, L16). The data are from the DGS survey . L16 selected 19 DGS galaxies that had high enough S/ N in Spitzer/IRS spectra and constructed the templates by combining the IRS spectra with Herschel and WISE photometry ( Table 7 in L16). The main difference between their SEDs and those presented in Rémy-Ruyer et al. (2015) is that L16 adopted the power law+MBB fitting procedure of C12, while Rémy-Ruyer et al. (2015) used the semiempirical dust SED models of Galliano et al. (2011). The advantage of the C12 model in this case is that, owing to the well-sampled mid-IR emission of these 19 galaxies, L16 were able to relax the bounding condition of the power-law turnover wavelength, λ c (see the discussion in Section 3.2). As a result, they were able to successfully capture the flux density excess at λ  50 μm with their fits, while Rémy-Ruyer et al. (2015) had to invoke a second MBB with a hotter temperature to fit the flux excess.

Appendix B Derivation of Dust Masses
To derive dust masses from RJ emission based on Equation (3), we need to assume a cold dust temperature, the opacity at a reference wavelength κ 0 , and a submillimeter emissivity index β. Here, we discuss each of these assumptions in our calculations and their associated systematic uncertainties.
Cold dust temperature. For the cold dust temperature, we perform a Monte Carlo simulation by drawing dust temperatures from a Gaussian distribution with a mean of 25 K and σ = 5 K. This assumption takes into account the scatter in mass-weighted dust temperatures based on the results of previous studies with better submillimeter wavelength coverage both at z ∼ 0 down to low metallicities and for massive galaxies at higher redshifts (Cortese et al. 2014;Kirkpatrick et al. 2015;Rémy-Ruyer et al. 2015;Scoville et al. 2016;Faisst et al. 2017). These studies show that the mass-weighted temperature varies less than the luminosity-weighted dust temperature derived from shorter wavelength data. As the mass-weighted dust temperature is the relevant quantity to derive dust masses from RJ emission, we use T = 25 ± 5 K in Equation (3) and do not consider the hotter dust component that appears in Figure 5. Dust mass inferred from the RJ emission is only linearly dependent on dust temperature (M dust ∝ 1/T), and hence the relative contribution of a possible warm component dominating the FIR luminosity would not be significant. For example, in the top-right panel of Figure 5, in which the warm component has the most significant flux contribution to the RJ emission compared to the other bins, if we assume half of the 1.2 mm flux originates from the warm component with T = 54 K, the dust mass would be lower by only 35%, which does not change any of the main conclusions in this paper.
Emissivity index β. For β, we take the average of our best fits to the solar and subsolar-metallicity stacks from the earlier sections in the paper, β = 1.5. An emissivity index of 1.5-2.0 has been previously observed in moderately lower-metallicity local galaxies, for example in the outer disk of M33 (Tabatabaei et al. 2014), in the Magellanic Clouds (Planck Collaboration et al. 2011), in local galaxies (Cortese et al. 2014;Faisst et al. 2017), and in a few higher-redshift galaxies (Faisst et al. 2020). Our fits to the subsolar-metallicity stacks strongly prefer an even lower β = 1. The emissivity index derived from SED fitting is in fact an effective emissivity index, which can be different from the intrinsic emissivity of the cold grains (Dunne et al. 2000;Galliano et al. 2011). If there is a distribution of temperatures (which is realistically always the case for integrated observations of high-redshift galaxies), the SED will be broadened and the effective β will be lower. In our case of subsolar-metallicity observations, because of the presence of a significant warm component, if we use the flexible models of Casey (2012), the best-fit model with the least χ 2 is the one with lowest allowed β, as it provides the widest possible SED to account for the excess emission at ∼30 μm. The best-fit β value in the Casey (2012) models also depends on the pivot wavelength assumption (λ c ), which determines the transition point between the MBB and powerlaw components. λ c is fixed in our fits based on the suggested value in Casey (2012) that is derived from a sample of GOALS galaxies. Therefore, the validity of this assumption for our galaxies is in question, which in turn affects the best-fit β. These degeneracies can only be diminished by a better sampling of the IR SED. In short, the best-fit emissivity index of 1 in our subsolar-metallicity fits is a lower limit on the intrinsic emissivity, which is likely lower than that of the solarmetallicity galaxies. Although it is common practice to assume the Galactic emissivity index of β = 2, it is shown in the resolved studies of LMC that the intrinsic β can be lower than 2 .
Opacity coefficient κ 0 . The absorption cross section per dust mass or dust mass opacity coefficient is the main source of uncertainty in dust mass estimates from FIR/submillimeter wavelengths. The absorption cross section is a function of wavelength in the form of k k = , where κ 0 is the normalization at a reference wavelength λ 0 . The reference normalization value is notoriously uncertain. It depends on the grain composition, size distribution, and many other factors that make the results of models that are even based on the same data set significantly different from each other. As a result, dust masses derived from different models can vary by up to a factor of ∼3. In addition to the β dependence in the functional form of κ, κ 0 itself depends on β Bianchi 2013). Grains with smaller emissivity index have larger submillimeter opacity. Therefore, the assumption of β in the dust model (or the MBB functional shape) should be consistent with the assumption of β that the κ 0 is based on. In fact, Bianchi (2013) shows that if β is kept consistent, dust masses are not significantly dependent on the assumption of β. Another source of uncertainty in κ 0 is the possible variations in the ISM and dust emission properties of the systems under study compared to those in the local MW, on which the κ 0 calibrations are often based (Dwek et al. 1997;Li & Draine 2001;Draine 2003;Zubko et al. 2004). In these studies, the absorption cross section per H atom is derived from the emission properties of the MW cirrus, and then converted to an absorption cross section per dust mass by adopting a MW gas-to-dust ratio. Bianchi et al. (2019) uses a different method (also used by Clark et al. (2016)) to derive dust masses from integrated measurements of gas masses, metallicities, and FIR observations for a local sample from the DustPedia project. The study shows a large range of κ 0 values that vary galaxy-to-galaxy, and are mildly dependent on metallicity. Their results are conditional on the assumptions made for the fraction of metals