Stellar mass is not the best predictor of galaxy metallicity The gravitational potential–metallicity relation Φ ZR

Context. Interpreting the scaling relations followed by galaxies is a fundamental tool for assessing how well we understand galaxy formation and evolution. Several scaling relations involving the galaxy metallicity have been discovered through the years, the fore-most of which is the scaling with stellar mass. This so-called mass–metallicity relation is thought to be fundamental and has been subject to many studies in the literature. Aims. We study the dependence of the gas-phase metallicity on many di ﬀ erent galaxy properties to assess which of them determines the metallicity of a galaxy. Methods. We applied a random forest regressor algorithm on a sample of more than 3000 nearby galaxies from the SDSS-IV MaNGA survey. Using this machine-learning technique, we explored the e ﬀ ect of 148 parameters on the global oxygen abundance as an indicator of the gas metallicity. Results. M (cid:63) / R e , as a proxy for the baryonic gravitational potential of the galaxy, is found to be the primary factor determining the average gas-phase metallicity of the galaxy ( Z g ). It outweighs stellar mass. A subsequent analysis provides the strongest dependence of Z g on M (cid:63) / R 0 . 6 e . We argue that this parameter traces the total gravitational potential, and the exponent α (cid:39) 0 . 6 accounts for the inclusion of the dark matter component. Conclusions. Our results reveal the importance of the relation between the total gravitational potential of the galaxy and the gas metallicity. This relation is tighter and likely more primordial than the widely known mass–metallicity relation.


Introduction
The gas-phase metallicity (Z g ) of local galaxies has proven to be a key observable for numerical simulations and theoretical models.As a consequence of recurrent star formation processes, the amount of metals in galaxies increases gradually.In this way, every internal and external process that a galaxy experiences leaves imprints on the chemical distribution of its gas.This parameter has been widely confirmed as a powerful tool for improving our knowledge of the formation and evolution of galaxies (Mosconi et al. 2001;Lia et al. 2002;Brooks et al. 2007;Oppenheimer & Davé 2008;Davé et al. 2011;Pilkington et al. 2012;Mollá et al. 2019;Spitoni et al. 2019, among many others).
Many works have reported significant correlations both globally and locally between the gas metallicity and other galaxy properties, such as the stellar mass (M ⋆ , Lequeux et al. 1979;Tremonti et al. 2004;Sánchez et al. 2017), the star formation rate (SFR, Mannucci et al. 2010;Sánchez-Menguiano et al. 2019;Sánchez Almeida & Sánchez-Menguiano 2019), the stellar age (Lian et al. 2015;Sánchez-Menguiano et al. 2020), the rotation velocity (Garnett 2002;Pilyugin et al. 2004), the gas mass fraction (Bothwell et al. 2013;Barrera-Ballesteros et al. 2018), and the galaxy size (Ellison et al. 2008;Sánchez Almeida & Dalla Vecchia 2018).Of all galaxy properties, M ⋆ has been shown to present the strongest correlation with Z g , following the so-called mass-metallicity relation (MZR).As the stellar mass increases, the metallicity increases, until the relation flattens at high masses (e.g.Tremonti et al. 2004;Kewley & Ellison 2008;Sánchez et al. 2013;Wu et al. 2016;Barrera-Ballesteros et al. 2017;Sánchez et al. 2017Sánchez et al. , 2019;;Curti et al. 2020).The MZR is found to exist up to z ∼ 3.5 with a similar shape, although it evolves with redshift such that high-z galaxies are less enriched than local ones for a given stellar mass, especially at low masses (Maiolino et al. 2008;Mannucci et al. 2009;Yabe et al. 2012;Maier et al. 2014;Zahid et al. 2014;Sanders et al. 2015;Hidalgo 2017).This relation has been proposed to be the result of the interplay of many processes such as metal removal by outflows, dilution by metal-poor gas infall, enrichment by previous episodes of star formation, or simple evolution by the so-called downsizing, where most massive galaxies exhaust their gas reservoir faster (Maiolino & Mannucci 2019).
Despite the relevance confirmed for the MZR in the literature, a recent study by D'Eugenio et al. (2018) suggested that gas metallicity correlates more tightly with the average gravitational potential (estimated as Φ = M ⋆ /R e , where R e is the effective radius of the galaxy) than with either the stellar mass or the average mass surface density (Σ ⋆ ≡ M ⋆ /R 2 e ).Using a volumelimited sample drawn from the publicly available Sloan Digital Sky Survey Data Release 7 (SDSS DR7), their conclusions were based on (i) a lower scatter in the relation, (ii) a higher Spearman rank correlation coefficient, and (iii) a weaker residual trend with R e .The three relations were explained in terms of metal enrichment (driven by the gas fraction, and therefore Σ ⋆ ) and metal loss (proportional to the stellar mass).Alternatively, they could emerge as a consequence of an existing local relation between Z g and the escape velocity (e.g., reported in Barrera-Ballesteros et al. 2018), a local quantity directly related with the global gravitational potential.In this way, the potential-metallicity relation would be the primary relation, and the MZR and the ΣZR would arise from the local relation between Z g and Φ.This result contradicts another recent study by Baker & Maiolino (2023), who find a tighter relation between Z g and M ⋆ than between Z g and the dynamical mass (M dyn ) or Z g and the gravitational potential (estimated as Φ = M dyn /R e ).We note, however, that dynamical masses were derived via Jeans anisotropic modelling, which is a complex method that might increase the final scatter and thus undermine its importance.
In general, studies focus on investigating particular relations between the gas metallicity and other galaxy properties, but we lack works that address the issue in a holistic approach, involving a large number of parameters in the analysis.This is key for assessing with accuracy which property predicts Z g best, that is, which property yields the tightest relation with Z g , and so more probably is the primary relation.In this regard, the recent study by Alvarez-Hurtado et al. (2022) explored the impact of 29 physical parameters on the gas metallicity for a sample of 299 starforming galaxies from the CALIFA survey (Calar Alto Legacy Integral Field Area, Sánchez et al. 2012).The results of the analysis indicate that the stellar mass is the physical parameter that traces the observed Z g better.However, they did not include the gravitational potential as a parameter in the analysis.
This paper describes our work to provide a robust answer to the question.In order to do this, we applied a random forest regressor algorithm on a sample of more than 3000 nearby starforming galaxies from the MaNGA survey (Mapping Nearby Galaxies at Apache Point Observatory, Bundy et al. 2015).Using this technique, we explore the effect of 148 parameters on the global oxygen abundance of the galaxies as an indicator of the gas metallicity.We include in the model all properties that have previously been reported to strongly correlate with Z g , such as M ⋆ , Φ, Σ ⋆ , and M gas .
The paper is organised as follows.Section 2 briefly presents the MaNGA data, and the description of the procedure we followed in the analysis is given in Section 3, including the physical parameters of the model (Section 3.1), the criteria we adopted for the sample selection (Section 3.2), and an overview of the random forest algorithm (Section 3.3).The outcome of the random forest is described in Section 4.1, and a subsequent analysis including different combinations of M ⋆ and R e is described in Section 4.2.All the results are discussed in Section 5, and Section 6 compiles and summarises the main conclusions of the work.Finally, supplementary material is included in Appendices A and B. Appendix A presents the complete list of all the galactic parameters included in the model.Appendix B reproduces the analysis using alternative calibrations to estimate Z g .

MaNGA data
We took advantage of the large statistics provided by the MaNGA survey (Bundy et al. 2015), a now complete project based on integral field spectroscopy (IFS) and part of the fourthgeneration SDSS (SDSS-IV).With all the data publicly released (DR17, Abdurro'uf et al. 2022), MaNGA has gathered spatially resolved information for 10 010 galaxies up to redshift ∼ 0.15.
The data were collected using the two BOSS spectrographs mounted on the Sloan 2.5 m telescope at Apache Point Observatory (Gunn et al. 2006).The 17 simultaneously available fibre bundles are distributed in five different hexagonal configurations, whose field of view (FoV) varies from 12.5 ′′ to 32.5 ′′ in diameter (Drory et al. 2015).The covered wavelength range spans from 3600 Å to 10300 Å, with a nominal resolution of λ/∆λ ∼ 2100 at 6000 Å (Smee et al. 2013).The resulting spectra for each sampled spaxel of 0.5 ′′ × 0.5 ′′ present a final spatial resolution of FWHM ∼ 2.5 ′′ , which corresponds to a physical resolution of ∼ 1.5 kpc at an average redshift of 0.03.
Details of the MaNGA mother sample, the survey design, the observational strategy, and the data reduction are provided in Law et al. (2015), Yan et al. (2016), Law et al. (2016), Wake et al. (2017), andLaw et al. (2021).

Galaxy parameters
In order to determine the galactic property that best predicts the gas metallicity, we explored the effect of 148 parameters.These parameters were extracted from the pyPipe3D Value Added catalogue (VAC, Sánchez et al. 2022), which is publicly accessible through the SDSS-IV VAC website 1 .This VAC contains the dataproducts of pyPipe3D for the full DR17 MaNGA sample (Lacerda et al. 2022, see also Sánchez et al. 2016b,a).This analysis pipeline was developed to characterise the properties of the stellar populations and the ionised gas.The catalogue comprises a single table with the integrated values, those measured at the effective radius, and the radial gradients of different properties (e.g.stellar mass, star-formation, oxygen and nitrogen abundances, or dust attenuation), as well as a FITS file per galaxy with the corresponding spatially resolved quantities.
For this study, we focused on the global galaxy properties provided in the VAC table.This table comprises 304 parameters (and in most cases, their associated error estimates), many of which correspond to different line intensity values or ratios (e.g. the logarithm of the [NII]6583/Hα line ratio in the central 2.5 ′′ aperture).These quantities were discarded from the study because including them would result in obvious dependences of some galaxy properties and the emission lines from which they are derived (e.g. between the star formation rate and the Hα intensity).In addition, as a proxy for the global gas metallicity (Z g ), we used the oxygen abundance (O/H) measured at one disc effective radius (R e ) 2 and determined from the empirical calibration proposed by Marino et al. (2013, hereafter M13) for the O3N2 index.Other Z g calibrators were employed and led to results similar to those described in Sec.4.1 for O3N2 (see Appendix B for a deeper discussion).All these additional measurements of O/H were also discarded from the main analysis for clear reasons.Finally, the ionisation parameter, the nitrogen abundance, and the N/O abundance were also ignored because their determination involves line ratios in common with the O/H derivation.All in all, the final selection of galaxy properties available for the analysis (N par ) comprises 148 parameters.Table 1 compiles the most representative galactic properties, and the complete list is detailed in Appendix A.

Sample selection
The galaxy sample we analysed was drawn from the 10 010 galaxies comprising the MaNGA mother sample.In order to select the galaxies, we adopted two simple criteria: Galaxies have to meet the required quality standards of the analysis pipeline (i.e. a QCFLAG field equal to zero in the pyPipe3D VAC table; see section 4.5 of Sánchez et al. 2022 for details), and they must provide values for all of the analysed parameters.The second criterion is quite restrictive.For instance, it removes early-type systems and galaxies in general with little or no ionised gas (for which gas properties such as the oxygen abundance or the dust extinction cannot be derived).The final sample contains 3 430 galaxies.This number is high enough to ensure the statistical significance of the results.However, we note that the sample is biased towards large star-forming galaxies.

Random forest regressor
We applied a supervised machine-learning technique known as random forest (RF) regressor (Breiman 2001).Through a combination of decision trees, the algorithm finds the input features (the galaxy properties in our case) that contain more information on the target feature (Z g ) and creates a model to predict it by defining a set of conditions on the values of the input features.RF regressors have been extensively used in astronomy in the past decade with high levels of success (e.g.Carliles et al. 2010;Carrasco Kind & Brunner 2013;Sánchez-Menguiano et al. 2019;Moster et al. 2021, among many others).The algorithm was implemented using the scikit-learn package for Python (Pedregosa et al. 2011).We briefly describe the basic steps we followed to run the algorithm, but we refer to the scikit-learn User Guide documentation for more details of the complete algorithm implementation3 .We first randomly split the sample of 3 430 galaxies into two subsets: One set was used to create the model (training sample; 2 572 galaxies, corresponding to two-thirds of the sample), and the other set was used to evaluate its performance (test sample; 858 galaxies, comprising the remaining one-third of the sample).For the training sample, the selected set of galaxy properties was provided to the algorithm to train the model (these are the predictors).Z g was considered the target feature that was to be predicted by the RF model (this is the solution).The algorithm requires not only the predictors, but also the solutions (called labels) to train the model.When the model was trained, we applied it to the test sample, obtaining a set of predictions for Z g .The comparison of these predictions with the measured values provides an estimate of the model precision.
Before training the model, we selected the values of the parameters in the algorithm (called hyperparameters, HP) that optimise its performance.The HPs include the number of trees in the forest (n), the number of randomly selected features to consider in each split (m), the maximum depth of the trees (max depth ), the minimum number of samples required in each split (min split ), and the minimum number of samples that remain at the end of the different decision tree branches (min lea f ).We set m = N par = 148 to fully disentangle the relative interdependences among the input parameters, allowing the algorithm to select any of them at each split.If we choose m < N par , a variable containing significant information on the target will be replaced by any correlated parameter when it is not included among the randomly selected features to consider in a split.This will falsely reduce the importance of this variable in the model.By setting m = N par , any small difference in the performance of two correlated parameters will be detected by the RF, breaking possible degeneracies in the relations of the parameters.We refer to Bluck et al. (2022, appendix B.2) for detailed tests of how well RF can perform with inter-correlated data.Nevertheless, this decision must be balanced with a careful choice of the other HPs to avoid overfitting the data.In order to select the remaining HPs, we used the fivefold cross-validation (CV) method (James et al. 2013), which consists of splitting the training sample into five subsets, called folds.We trained the model on a different combination of four folds and evaluated it on the remaining fold.The reported performance measure was then the average of the values computed in the loop.We performed 100 iterations of the entire 5-Fold CV process, each time using different values for the HPs.We selected the HP values that achieved the highest average performance across the five folds.In our case, this resulted in n = 700, max depth = 30, min split = 4, and min lea f = 2.We note, however, that the results we obtained are not very sensitive to the selected values for the HPs.
Finally, using the selected HPs, we trained the model on the full training sample, and then evaluated it on the test sample.Figure 1 represents the predicted Z g by the RF algorithm against the measured values for the training (blue circles) and test (brown triangles) samples.The test sample follows a trend around the one-to-one relation (dashed line), similar to the training sample (R 2 score of 0.87 and 0.98, respectively), showing that the algorithm is able to predict the Z g of galaxies that are not used to train the model with comparable accuracy.This ensures that the model is not affected by overfitting.The median error in the predictions (predictions-targets) is 0.017 dex, which is about the same as the observational errors propagated into Z g (∼ 0.011 dex, which does not consider the systematic error in the calibrator).

Feature importance from random forest
The model obtained by the RF algorithm allows us to investigate the connection between different galaxy properties (the features) and their overall gas metallicity (the target).Although the model was fed with more than 100 input parameters that characterise the galaxies, only a few of them are typically important for an accurate prediction for the target.The remaining parameters have little to no impact on Z g .
In order to find the most relevant features, the RF provides a measurement of the relative importance of each feature in predicting the solution.In rough outlines, the so-called feature importance is a measure of how effective the feature is at reducing The features are ranked by decreasing order of importance.The shaded area represents the standard deviation of the mean relative importance among the 50 runs.The meaning of the labels is detailed in Table 1.
variance when the variables are split along the decision trees.The higher the value, the more important the feature.Figure 2 shows the relative importance of the first ten input features (see Table 1 for a description of the labels), ranked by decreasing order of the importance value.In order to test the stability of these values, we ran the RF algorithm 50 times, each time with a different randomly selected training sample.The figure shows the average trend of the values for the 50 realisations together with the standard deviations (shaded area).To our initial surprise, the baryonic gravitational potential (Φ baryon ) was the primary factor determining the global gas metallicity, with a mean importance value of ∼ 0.51, which is well above that of the other values.The second most relevant parameter was the mass-weighted stellar metallicity measured at 1 R e (MW-[Z/H] Re ), which presents a relative importance of ∼ 0.08.The remaining properties had an importance value below 0.05 and are therefore of little or no relevance for determining the gas metallicity.This includes the stellar mass, which occupies the third position in the ranking, with a relative importance of ∼ 0.04.
The importance value of Φ baryon is much higher than that of the stellar mass, which means that the gravitational potential is significantly more efficient than the mass to decrease the variance when the variables are split along the decision trees.To confirm this, we ran the RF with the gravitational potential as the only parameter in the model.The root mean square error (RMSE) of the difference between the predictions (modelled Z g ) and the targets (measured Z g ) is 0.052 ± 0.001 (50 runs of the algorithm).If instead we use the stellar mass as the only parameter in the model, the RMSE increases up to 0.054 ± 0.001.This test supports the finding that Φ baryon better describes the gas metallicity than M ⋆ .We would also like to note that although the relative importance of the stellar mass is lower than MW-[Z/H] Re , we must not forget the correlation between Φ baryon and M ⋆ , which might cause the reduced importance of M ⋆ in the model.We also ran the RF with the stellar metallicity as the only parameter in the model.We obtained an RMSE of 0.056 ± 0.001, confirming that although M ⋆ is the third most relevant parameter in the original model, it regulates the gas metallicity better than MW-[Z/H] Re .In order to visualise the shape of the dependence of the global gas metallicity on the stellar mass and on the baryonic gravitational potential, we present in Figure 3 Z g as a function of both Φ baryon (hereafter called ΦZR, top) and M ⋆ (the so-called MZR, bottom).The shape of the relations is very similar in both cases: Z g increases when M ⋆ and Φ baryon increase, although the flattening at the high end is more pronounced in the MZR.Regarding the dispersion, we find ΦZR to be slightly tighter than the MZR, although the difference is small (2%).In any case, these results agree with the conclusions derived from the RF analysis, supporting the finding that Φ baryon is a slightly better predictor of Z g than M ⋆ .
4.2.Gravitational potential, or the combined effect of stellar mass and galaxy size The ratio M ⋆ /R e was used in the analysis as a proxy for the galactic gravitational potential at one effective radius.This firstorder approximation ignores the effects of the dark matter (DM) halo on the potential in these inner regions, which is assumed to be dominated by the baryonic component (which in turn is dominated by the stellar component).We address the role of DM below.
The correlation between gas metallicity and both stellar mass and galaxy size has been described in numerous studies (e.g.Ellison et al. 2008;Sánchez Almeida & Dalla Vecchia 2018;Sánchez et al. 2017Sánchez et al. , 2019)).Since Φ baryon is estimated as M ⋆ /R e , the fact that Φ baryon resulted as the primary factor determining Z g could simply reflect therefore the combined effect of both stellar mass and galaxy size on Z g .In this case, the combination of stellar mass and radius that best fits the metallicity could be in the form of M ⋆ /R α e , with the exponent α similar to but not necessarily equal to one.In order to further test this hypothesis, we ran the RF again, and this time included 20 new input parameters in the shape of M ⋆ /R α e , with α = 0.1, ..., 2, in steps of 0.1 (α = 1, corresponding to Φ baryon , and α = 2 to Σ ⋆ ).If another value of α 1 fits better than Φ baryon , it could suggest that the real predictor of the Z g is the combined effect of stellar mass and galaxy size and not the baryon gravitational potential of the galaxy alone.
The outcome of this test is shown in Table 2, that is, the average importance values (out of 50 realisations) of the new M ⋆ /R α e features included in the RF, together with their position in the ranking list.Only the first five parameters are displayed.Interestingly, Φ baryon (α = 1) is replaced by M ⋆ /R 0.6 e as the parameter with the strongest effect on Z g .This result seems to suggest that the gravitational potential set by the baryons is indeed not the best predictor of Z g , but a particular combination of mass and size, M ⋆ /R 0.6 e .However, as we pointed out above, in our approximation, we ignored the effects of the DM halo.α ≃ 0.6 might account for the inclusion of the DM component in the whole gravitational potential of the galaxy.In this way, M ⋆ /R 0.6 e would be a better tracer of the real Φ than M ⋆ /R e , and our main conclusion would remain: The total gravitational potential would be the galactic property with the strongest impact on Z g .This seems indeed to be the case, as we discuss below.
In order to investigate the conjecture, we worked out the expression for the fraction of the DM, that is needed for the logarithm of the total potential to scale with the parameter used in the RF, that is, the logarithm of where a and b are two scaling constants, and M T is the total mass, composed of baryons, M baryon , and DM, M DM .Assuming a functional dependence of f DM of the type with B an arbitrary factor, we find after some algebra that the functional form in Eq. ( 3) is consistent with Eq. ( 2) provided We considered M baryon ≃ M ⋆ , which is a good approximation for most MaNGA galaxies (Lin et al. 2020).The functional form in Eq. ( 3) was chosen because it seems to be followed by the galaxies produced in cosmological numerical simulations, as we argue below.Figure 4 uses Eqs. ( 3) and ( 4) to show the baryon fraction for Eq. ( 2) to hold, depending on the value of α (the coloured lines).The figure also includes the theoretical expectation for f DM (the symbols) as predicted by the Illustris-TNG100 simulations (Lovell et al. 2018) at redshift 0, a prediction that we extracted from fig. 5 in Nestor Shachar et al. ( 2023).The predictions from two additional simulations by Oser et al. (2010) and Remus et al. (2017) are also shown, reflecting that the slope of the relation between f DM and Σ ⋆ given by β might slightly change depending on the set of simulations that is used.Figure 4 illustrates that α = 0.6 (the solid orange line) matches the theoretical expectation from Illustris-TNG100 (grey symbols and shaded area) extremely well, except for extreme surface densities.The match is encouraging because no freedom or fitting is shown in the representation in the figure (except for a shift in the y-axis given by the constant B).In any case, the match for α = 0.6 is much better than 0.5 or 0.7.We interpret this match as evidence that M ⋆ /R 0.6 e traces Φ, and therefore, evidence that our RF reveals a true relation between the total gravitational potential of the galaxy and the gas metallicity, which is tighter and more primordial than the MZR or the baryonic ΦZR.

Discussion
The search for relations between integrated physical properties in galaxies has been widely addressed in the literature.In particular, scaling relations involving the gas-phase oxygen abundance (as a proxy for the gas metalllicity, Z g ) have received much attention because it is a tracer of the overall chemical evolution of galaxies, modulated by the inflow or outflow of gas in different regions (Sánchez Almeida et al. 2014;Maiolino & Mannucci 2019;Sánchez 2020).For this reason, the dependence of Z g on other properties (or vice versa) can provide important insights into galaxy formation and evolution.
In this study, we analysed the global properties of a sample of more than 3000 nearby galaxies from the SDSS-IV MaNGA survey (Bundy et al. 2015).The aim of this analysis is to evaluate the galactic property that is the best predictor of Z g , and to do this, we explored the effect of 148 parameters extracted from the pyPipe3D Value Added catalogue (Sánchez et al. 2022).We include the parameters that were previously reported to strongly correlate with Z g , such as M ⋆ , Φ, Σ ⋆ , and M gas .We used the oxygen abundance measured at 1 R e as a proxy for the global gas metallicity, determined with the O3N2 calibration proposed by Marino et al. (2013).
For this analysis, we applied a RF regressor, which is a supervised machine-learning technique that has been proven successful in the past in searches for the primary relations between galaxy properties (e.g.Sánchez-Menguiano et al. 2019;Baker et al. 2023a,b).Our trained model was able to predict Z g with an average error (defined as the difference between the predictions and the targets) of about the propagated uncertainties associated with the measured values.
Based on the relative importance of each galaxy property in predicting Z g , we obtained that the baryonic gravitational potential (defined as Φ baryon = M ⋆ /R e ) is the most relevant parameter of the model, followed by MW-[Z/H] Re and M ⋆ , with an importance value more than five times lower than that of Φ baryon .This result was supported by the outcome of a subsequent run of the RF with only one parameter in the model.When that parameter was M ⋆ (MW-[Z/H] Re ), the RMSE of the differences between the predictions and the targets was 0.054 (0.056) ± 0.001, whereas this value was reduced to 0.052 ± 0.001 for Φ baryon .The difference between these values and that found for the complete model (0.029) suggests that, assuming there is no overfitting, other secondary parameters play a significant role in shaping the gas metallicity.This will be investigated in a future study (Sánchez-Menguiano et al. in prep.).Furthermore, the scatter of the relations between Z g and Φ baryon (ΦZR) and between Z g and M ⋆ (the so-called MZR) shows that the ΦZR is slightly tighter than the MZR (2%).Although the differences are small in both cases, these tests support the finding that Φ baryon is better suited than M ⋆ to describe the gas metallicity.
These results are representative of the whole population of analysed galaxies, which spreads over a wide range of stellar masses (8.5 < log(M ⋆ /M ⊙ ) < 11.5).Although our mass coverage is not homogeneous (there is an excess of massive galaxies compared to low-mass systems), we show in Appendix C that this bias does not affect the analysis because using a subsample of galaxies with a uniform mass distribution (constant number of objects per mass bin) leads to the same results.Restricting the analysis to a more limited mass range, however, might lead to different conclusions.For instance, because both the MZR and the ΦZR saturate at high masses, a sample that is highly dominated by massive systems and with a poor coverage of low-mass galaxies could reduce the effect of both M ⋆ and Φ baryon on Z g (because Z g remains almost constant, while both parameters increase).It is therefore important to rely on a wide mass range that allows covering a wide dynamical range for Z g in order to detect true primary and secondary relations.
It is interesting to note that the stellar mass estimated from photometry (within the MaNGA FoV) based on the use of the M/L ratio and the relations provided in Bell & de Jong (2000) is a better predictor of Z g than the mass derived from the spectroscopic analysis of the stellar populations performed by pyPipe3D.Whereas the first obtains an importance value of 0.043, the later obtains just 0.001.Another estimation of the integrated stellar mass from the NSA catalog provides an intermediate value (0.010).Photometric stellar masses are considered to be less accurate, but are probably more precise than estimates based on the spectroscopic analysis performed by pyPipe3D.For a more detailed comparison of the mass estimations, we refer to Sánchez et al. (2022).
Due to the well-known discrepancies between the oxygen abundances derived using different strong-line calibrators (for a review on the topic, see Kewley et al. 2019), we reproduced the analysis employing 18 alternative O/H estimators.Consistent with the use of O3N2-M13, in 15 out of the 18 cases, we obtained that Φ baryon is the best predictor of the global gas metallicity.This strengthens the validity of the result, which is independent of the method used to estimate the abundances.In the remaining three, the discrepancies might be due to the small dynamical range of the calibrators (more details can be found in Appendix B).This is not the first study to find that the correlation of the gas metallicity with the baryonic gravitational potential is stronger than with the stellar mass.Using a volume-limited sample drawn from the publicly available SDSS DR7, D 'Eugenio et al. (2018) analysed the relation between Z g and three galaxy properties: Φ baryon , M ⋆ , and average Σ ⋆ .To minimise aperture bias, the authors also defined three aperture-matched subsamples to ensure an homogeneous coverage of the physical size of galaxies from the fixed-aperture SDSS fibres: R fib /R e = 0.5, 1.0, and 1.5.For the three subsamples, they find that ΦZR had the lowest scatter and the highest Spearman correlation coefficient.In addition, this relation had the lowest residual trends with galaxy size.Based on these results, the authors concluded that of the three parameters, Φ baryon is the best predictor of Z g .According to D' Eugenio et al. (2018), there are two possible explanations for this.The correlation of Z g and Φ might arise from the combination of metal loss (which depends on the value of the escape velocity, which in turn is related to M ⋆ ) and metal enrichment (driven by f gas , and therefore Σ ⋆ ).Each of these effects induces a correlation between Z g and M ⋆ /R n e , with n = 0 and n = 2, respectively, and their combination could produce an overall correlation with an intermediate value of n.However, because the metallicity of low-mass galaxies is highly affected by outflows, Z g is predicted to correlate best with M ⋆ at the low-mass end of the relation.Similarly, because the escape fraction for highmass galaxies is small, Z g should correlate best with Σ ⋆ at the high-mass end of the relation.These predicted trends are not observed in their analysis, and therefore, the authors pointed to another explanation for the ΦMR: a direct physical link between the average depth of the gravitational potential and the average metallicity, which is a consequence of an existing local relation between Z g and the escape velocity (this local quantity is di-rectly related with the global gravitational potential).We note that this local relation does exist and has already been reported in Barrera-Ballesteros et al. 2018.The first explanation would imply that the ΦMR is the result of the already known MZR and ΣZR, whereas the latter indicates that the potential-metallicity relation is the primordial relation, and the MZR and the ΣZR would arise from the local relation between Z g and Φ.
The first explanation provided by D'Eugenio et al. ( 2018) could be further tested by re-applying the RF and include different parameters in the model in the shape of M ⋆ /R α e .We tried 20 combinations with α = 0.1, ..., 2, in steps of 0.1.Interestingly, Φ baryon (α = 1) is replaced by M ⋆ /R 0.6 e as the parameter with the strongest effect on Z g .When the alternative calibrations were applied to determine Z g , the coefficients α = 0.7 and α = 0.8 in some cases replaced α = 0.6 as those occupying the highest position in the importance ranking.This may suggest that the ΦMR is indeed not the primordial relation, but results from the MZR and ΣZR with a particular combination of the effects of metal loss and metal enrichment.
However, we provide here another explanation to justify the result.We have ignored the effects of the DM halo in the analysis so far.α ≃ 0.6 might account for the inclusion of the DM component in the whole gravitational potential of the galaxy.In Figure 4 we showed that a scale of the form M ⋆ /R 0.6  e for the total gravitational potential matches the theoretical relation between the DM fraction and the baryonic surface density predicted in the Illustris TNG cosmological numerical simulation very well (Nestor Shachar et al. 2023).In this way, M ⋆ /R 0.6 e would be a better tracer of the real Φ than M ⋆ /R e , and therefore, the total gravitational potential (and not its baryonic component) would be the galactic property that predicts Z g best.The substitution of α = 0.6 by α = 0.7 or α = 0.8 for the parameter with the strongest effect on Z g when using other abundance calibrations can be accounted for by the large scatter in the prediction by Nestor Shachar et al. ( 2023) (shaded area in Fig. 4), and the fact that the slope of the relation might vary slightly depending on the simulations used (Oser et al. 2010;Remus et al. 2017).
A recent study by Baker & Maiolino (2023) also explored the relation between Z g and (i) M ⋆ , (ii) dynamical mass (M dyn , DMZR), and (iii) gravitational potential (estimated as Φ = M dyn /R e ) using MaNGA data.Based on three different methods (average dispersion, partial correlation coefficients, and RF), the authors reported that the gas metallicity primarily depends on the stellar mass, with little or no dependence on either dynamical mass or gravitational potential when the dependence on stellar mass is taken into account.However, their M dyn were derived via Jeans anisotropic modelling, which is a complex method with its own systematic problems that might increase the intrinsic scatter in DMZR and ΦZR with respect to MZR (e.g.Li et al. 2016;El-Badry et al. 2017;Read et al. 2021).We argue that our derivation of Φ baryon , although less accurate, produces more precise results that allow us to determine the intrinsic scatter in ΦZR, resulting in a tighter relation than the MZR.This is in contrast to the results reported by Baker & Maiolino (2023).

Conclusions
We studied the dependence of the gas-phase metallicity (Z g ) on a large set of galaxy properties, including parameters that have been reported in the literature to correlate strongly with Z g , such as the stellar mass (M ⋆ ), the baryonic gravitational potential (Φ baryon ), the average stellar mass surface density (Σ ⋆ ), or the gas mass (M gas ).In order to do this, we applied a RF regressor algorithm on a sample of more than 3000 nearby galaxies from the SDSS-IV MaNGA survey.Our results are listed below.
(i) M ⋆ /R e , as a proxy for Φ baryon , is found to be the primary factor determining Z g (Section 4.1; Figure 2).Because the gravitational potential regulates the ability of a galaxy to retain metals, it is not surprising that ΦMR is the primordial scaling relation involving Z g .
(ii) When we included a parameter of the type M ⋆ /R α e in the analysis, we find that α = 0.6 best correlates with Z g (Section 4.2; Table 2).We explored the effect of the DM halo on the gravitational potential, and we showed that a scale of the form M ⋆ /R 0.6 e for the total Φ matches the theoretical relation between the DM fraction and the baryonic surface density predicted in cosmological numerical simulations of galaxy formation very well (Section 4.2; Figure 4).This suggests that the total gravitational potential would be the galactic property that predicts Z g best.
(iii) In the analysis, we used the oxygen abundance measured with the O3N2 calibration by Marino et al. (2013) as an indicator of the gas metallicity.The use of alternative estimators still gives ΦMR as the primordial relation (Appendix B).The inclusion of the DM halo might affect the particular coefficient α, which could be better described by theoretical predictions from other simulations (Section 5; Figure 4).
(iv) We find evidence that Φ baryon alone cannot predict Z g and that other secondary parameters might play a significant role in shaping the gas metallicity (Section 5).This will be investigated in a future study (Sánchez-Menguiano et al. in prep.).In general, the coefficients α = 0.7 and α = 0.8 occupy the highest position in the ranking.The coefficient α = 0.6, which was the parameter with the strongest effect on Z g when estimated with O3N2-M13, presents the highest importance value for 3 out of the 15 calibrations, but it is the third parameter in the ranking overall when the results of all abundance indicators are averaged.In Sec.4.2 we showed that a scale of the form M ⋆ /R 0.6 e for the total gravitational potential matches the theoretical relation between the DM fraction and the baryonic surface density predicted in Nestor Shachar et al. ( 2023) very well.However, as we argued in Sec. 5, the scatter in the relation, as well as the fact that other simulations provide different slopes for the dependence of f DM with Σ ⋆ , mean that the interpretation is still reasonable that Φ, traced by the parameter M ⋆ /R α e , is the best predictor of Z g .We refer to that section for further discussions of the matter.

Fig. 1 .
Fig. 1.Global gas-phase metallicity -described by the oxygen abundance 12+log(O/H)-predicted by the RF algorithm vs. the measured values.The blue circles correspond to the training set, and the brown triangles represent the test sample.The dashed line indicates the oneto-one relation.

Fig. 2 .
Fig. 2. Average relative importance of the first ten input features in predicting the global gas metallicity for 50 runs of the RF algorithm.The features are ranked by decreasing order of importance.The shaded area represents the standard deviation of the mean relative importance among the 50 runs.The meaning of the labels is detailed in Table1.

Fig. 3 .
Fig. 3. Global gas metallicity as a function of the stellar mass (the socalled MZR, bottom) and the galactic gravitational potential (hereafter called ΦZR, top).The brown squares represent the mean Z g values in ten mass and Φ bins, respectively.The averaged binned values were fitted with a spline function (solid red lines).
Notes.α ranges from 0.1 to 2 in steps of 0.1.These values are averaged over 50 runs of the algorithm.The features are ranked by decreasing order of the importance value.Only the first five parameters are shown.See Sec.4.2 for more details.

Fig. 4 .
Fig. 4. Dependence of the dark matter fraction on the baryonic surface density for galaxies in the local Universe predicted by Illustris-TNG100 simulations (Nestor Shachar et al. 2023, figure5).This relation is compared with the one obtained from the five M ⋆ /R α e parameters with the highest relative importance in the RF.M ⋆ /R 0.6 e provides a good fit for the theoretical relation.Two additional relations for the dependence of f DM on Σ ⋆ from different simulations are also included(Oser et al. 2010;Remus et al. 2017).
these differences.Figure29ofSánchez et al. (2022) shows that while the other calibrators are clearly correlated with the abundances derived from H19 (arbitrarily selected as the fiducial calibrator), O3O2, R2, and O3S2 show a flatter distribution, which could bias the results and therefore dismiss their validity as abundance estimators for this work.For the 15 abundance indicators for which Φ baryon was found to be the most relevant parameter in the RF, we again ran the algorithm now including in the model 20 different combinations of stellar mass and galaxy size in the shape of M ⋆ /R α e , with α = 0.1, ..., 2 in steps of 0.1 (see Sec. 4.2 for more details).We show in Figure B.2 the resulting average importances (out of 50 realisations; the error bars represent the standard deviations) of the 6 α values that present the highest values overall.

Fig
Fig. B.1.Importance of the 15 features of the RF with the highest importance values as shown in Figure 2. The predicted Z g is estimated using different calibrations (coloured lines).See the caption of Figure 2 for more details.

Fig. C. 2 .
Fig. C.2.Average relative importance of the first ten input features in predicting the global gas metallicity for 50 runs of the RF algorithm.The features are ranked by decreasing order of importance.The shaded area represents the standard deviation of the mean relative importance of the 50 runs.The meaning of the labels is detailed in Table A.1.For this test, the uniform sample was used (see Figure C.1).

Table 1 .
Sánchez et al. (2022)sed parameters from the pyPipe3D VAC.Notes.The table only displays the most relevant analysed parameters (see Sec.4.1, Figure2) from the pyPipe3D VAC.The complete list is provided in Appendix A. We refer toSánchez et al. (2022)for more details of the derivation of the parameters.

Table 2 .
Relative importance of the M ⋆ /R α e features in the RF model.

Table A .
Sánchez et al. (2022)ed from the pyPipe3D VAC included in the analysis.whichthegalaxy has formed 95% of its stellar mass Notes.The MaNGA ID is included in the table only for identification.We refer toSánchez et al. (2022)for more details of the derivation of the parameters.Table B.1.Alternative calibrations to derive the oxygen abundance.