Toward a Better Understanding of Cosmic Chronometers: Stellar Population Properties of Passive Galaxies at Intermediate Redshift

We take advantage of the publicly available LEGA-C spectroscopic survey to measure the stellar population properties of 140 individual massive and passive galaxies at $z\sim0.7$. We develop and publicly release PyLick, a flexible python code to measure UV to near-IR spectral indices. With PyLick we study the H/K ratio as a new diagnostic based on the pseudo-Lick CaII H and K indices, and find that a cut in ${\rm H/K}<1.1$ can be used jointly with other criteria to select (or verify the purity of) samples of passive galaxies. By combining photometric and spectroscopic criteria, we select a reliable sample of passively evolving galaxies. We constrain single-burst stellar ages, metallicities $\mathrm{[Z/H]}$, and $\mathrm{[\alpha/Fe]}$ with an optimized set of Lick indices, exploring in detail the robustness of our measurement against different combinations. Even without imposing cosmological priors, the derived ages follow a clear trend compatible with the expected cosmological aging of the Universe. We observe no significant redshift evolution for the metal abundance with respect to the values derived at $z=0$, with median $\mathrm{[Z/H]}=0.08\pm0.18$ and $\mathrm{[\alpha/Fe]}=0.13\pm0.11$. Finally, we find positive correlations between $\log\mathrm{age}$, $\mathrm{[Z/H]}$, $\mathrm{[\alpha/Fe]}$ and the stellar velocity dispersion, with slopes of ($0.48\pm0.14$), ($0.26\pm0.17$), and ($0.23\pm0.11$), respectively; the small scatter of $<0.2$ dex points to rather homogeneous and short star formation histories. Overall, these results confirm and extend low-redshift findings of a mass-downsizing evolution. This work further strengthens the possibility of selecting pure samples of passive galaxies to be exploited reliably as cosmic chronometers to place independent cosmological constraints.


Introduction
The advent of deep spectroscopic surveys led to considerable progress in understanding galaxy formation and evolution over different cosmic epochs. The reason is that they enable us to combine two main observational techniques: look-back statistical studies of galaxy properties over different cosmic times, and the archaeological reconstruction of galaxy properties and star formation histories (SFHs) from spectroscopic data.
Galaxies with no or negligible levels of star formation (hereafter "passive" or "quiescent") are ideal laboratories to perform these studies because their stellar population is relatively simple to model (for a detailed review, see Renzini 2006). They are mostly found after the peak epoch of galaxy assembly (the so-called 'cosmic noon'), especially below z ∼ 1.5, when they dominate the stellar-mass density (Muzzin et al. 2013;Ilbert et al. 2013), and the number density of the most massive ones (M 10 11 M ) remains almost constant between z = 1 and z = 0 (Pozzetti et al. 2010;Moresco et al. 2013). On the contrary, at higher redshift, they constitute only a minor population (Cimatti et al. 2004;Daddi et al. 2005). Recent spectroscopic observations confirmed the existence of a few of these systems up to z ∼ 4 (Tanaka et al. 2019;Valentino et al. 2020;Santini et al. 2021) when the Universe was only ∼ 2 Gyr old, requiring an early intense star formation followed by a complete decline (known as quenching).
Many different methods are used to investigate their physical properties. An example is the study of the photometric spectral energy distribution (SED; e.g., Pacifici et al. 2016). Photometric observations can span a very large range of wavelengths (usually from UV to FIR), allowing a broad view of various physical quantities, such as stellar mass, star formation rate, stellar age, and metallicity. However, it is not possible to find precise constraints due to strong degeneracies, e.g., age-metallicity (Worthey et al. 1994). To break them, high signal-to-noise spectroscopy is needed. The wellestablished method of Lick indices (Faber 1973;Burstein et al. 1984;Worthey et al. 1994) involves the use of a set of absorption features, each one having a unique sensitivity to stellar ages and element abundances. Another main approach is based on the use of the entire spectral information (full spectral fitting; e.g., Cid Fernandes et al. 2005;Conroy 2013). All of these techniques independently contributed to establishing the downsizing scenario (first introduced by Cowie et al. 1996), according to which galaxy mass plays a major role in galaxy formation and evolution: massive galaxies are found to have evolved earlier and over shorter time scales than less massive ones. In the local Universe, this is supported by positive scaling relations between stellar ages, metallicities ([Z/H]), and α-element abundances ([α/Fe]) with galaxy mass (dynamical and stellar) found in large, high-quality spectroscopic samples of early-type galaxies (Kauffmann et al. 2003;Gallazzi et al. 2005Gallazzi et al. , 2006; Thomas et al. 2005Thomas et al. , 2010Treu et al. 2005;Conroy et al. 2014;McDermid et al. 2015;Scott et al. 2017;Siudek et al. 2017;Gallazzi et al. 2021).
The stellar metallicity, i.e. the amount of metals locked in stars, can shed light on the evolutionary stage of the stellar population, as well as on the external mechanisms that modify the chemical content of the interstellar medium. For instance, Peng et al. (2015) showed that local quiescent and star-forming galaxies form two distinct relations in the stellar-massmetallicity plane. The difference between their shape can be interpreted either as an evidence for 'strangulation' (i.e. the lack of new gas supply Peng et al. 2015;Trussler et al. 2021), or for shorter formation timescales coupled with galactic winds (Spitoni et al. 2017) as the main mechanism driving galaxy quenching. The relative α-element abundance with respect to iron is another key parameter. Core-collapsing massive stars are the main producers of α-elements (O, Mg, Si, Ca, Ti), polluting the interstellar medium over relatively short timescales (< 100 Myr). On the other hand, iron-peak elements (Fe, Cr) are primarily produced in Type Ia supernovae, which pollute the interstellar medium over longer timescales . For this reason, the mean stellar [α/Fe] has been traditionally used as a SFH timescale diagnostic de La Rosa et al. 2011). In addition to stellar population properties, the environment can in principle have a role in shaping galaxy evolution. However, its effects seem weak once the correlation with mass is removed (Thomas et al. 2010; La Barbera et al. 2014;McDermid et al. 2015;Trussler et al. 2021). To explore them in greater detail, very large samples of galaxies are needed (e.g., Bluck et al. 2020).
At higher redshift, studies of stellar population properties are more challenging and require deep nearinfrared spectroscopy. For this reason, they are mostly limited to samples of a few up to dozens galaxies (e.g., Jørgensen & Chiboucas 2013; Gallazzi et al. 2014;Lonoce et al. 2015Lonoce et al. , 2020Belli et al. 2019;Carnall et al. 2019;Kriek et al. 2019;Tacchella et al. 2021;Beverage et al. 2021), and/or require the stacking of different galaxy spectra (e.g., Choi et al. 2014;Onodera et al. 2015). In particular, detailed studies of galaxy ages and chemical abundances for individual galaxies at intermediate redshift have been presented in very few works and usually by assuming the age of a ΛCDM universe as the maximum age allowed in the stellar population analysis.
Beyond galaxy evolution studies, massive and passive galaxies encode valuable information about the underlying cosmological framework. In fact, as first proposed by Jimenez & Loeb (2002), it has been demonstrated that these objects can be used as cosmic chronometers to trace the differential age evolution of the Universe dt/dz, and to provide in this way a cosmologyindependent estimate of the expansion rate of the Universe H(z) = −1/(1 + z) dz/dt. While we refer to the literature for a detailed discussion of the method and of the systematics involved (Moresco et al. 2012b;Moresco 2015;Moresco et al. 2016Moresco et al. , 2020, we underline here that there are two key ingredients to be met. First, the best cosmic chronometers that have been studied are very massive and passively evolving galaxies; therefore, it is fundamental to accurately select these objects by carefully excluding any possible hints of starforming or young outliers. Second, it is crucial that in the differential age estimate dt no cosmological assumption is made. With the advantage of providing a cosmology-independent estimate of H(z), the cosmic chronometers method has also been considered in several cosmological studies to place constraints on various cosmological models and parameters (Moresco et al. 2012a;Seikel et al. 2012;Capozziello et al. 2014;Valken-burg et al. 2014;Sapone et al. 2014;Nunes et al. 2016;Solà et al. 2017;Moresco & Marulli 2017;L'Huillier & Shafieloo 2017;Yang et al. 2018;Haridasu et al. 2018;Gómez-Valent & Amendola 2018;Lin et al. 2020Lin et al. , 2021, with particular benefits over more standard cosmological probes (see e.g., Vagnozzi et al. 2021).
In this work, we take advantage of the deep spectroscopic observations of the second data release of the Large Early Galaxy Astrophysics Census (LEGA-C DR2; van der Wel et al. 2016;Straatman et al. 2018) at 0.6 z 1 to infer physical properties of individual passive galaxies without relying on any cosmological model or assumption. Our analysis is based on Pylick, a flexible Python tool to measure absorption features that we publicly release. It includes a wide set of indices already defined in the literature and also new diagnostics introduced and explored in this work for identifying passively evolving systems. Stellar population properties of each individual galaxy are derived with a Bayesian approach, adopting the simple stellar population (SSP) models of Thomas et al. (2011). The analysis of their trends with redshift and stellar velocity dispersion will allow us to understand the individual and median properties of passive galaxies over different cosmic epochs and explore the underlying cosmology.
The work is organized as follows. Section 2 gives an overview of the dataset, selection process, spectral index measurements, and main observational properties. Section 3 presents background information on the models and the stellar population analysis. The main science results are presented and discussed in Section 4. A summary is presented in Section 5.
In some cases, as reference values and mostly for illustrative purposes, we will use some theoretical relations based on cosmological models; for these cases, we adopt a '737 cosmology' (with H 0 = 70 km s −1 Mpc −1 , Ω m = 0.3, and Ω Λ = 0.7).

Data
The data used in this study are drawn from the second data release of LEGA-C, a recently completed ESO Public Spectroscopic Survey targeting ∼3000 K s -bandselected galaxies at 0.6 z 1 in the COSMOS field. The observations were carried out with the Visible Multi-Object Spectrograph (VIMOS) on the VLT at Paranal Observatory. The flux-calibrated spectra span a wavelength range of 6300 < λ < 8800Å with a spectral resolution of R ∼ 3500, and a median signal-to-noise ratio (S/N) of ∼ 15 per pixel (0.6Å). Spectra are obtained with 1 wide slits, corresponding to ∼ 7 kpc at these redshifts.  (Straatman et al. 2018). In our analysis, we consider the galaxy spectra, as well as the measurements of redshift (z), observed stellar velocity dispersion (σ ), and [O ii]λ3727 emission-line flux. For the main analysis, we do not use the spectral indices measurements provided in the catalog, but instead determine our own line strengths from the spectra after matching the stellar population models resolution. This also allows us to extend the measurements to redder indices up to ∼ 5000Å (see Section 2.3).
The LEGA-C set of Lick indices will be used to validate our measurement code PyLick in Appendix A.
We cross-match the LEGA-C sample with the COS-MOS2015 catalog (Laigle et al. 2016) using a search radius of 1 , to complement the spectroscopic information with photometric data, including bands NUV, r, and J, as well as stellar masses M and (specific) star formation rates (sSFR = SFR/M ) derived through SED fitting. Finally, we use morphological information from the Zurich Estimator of Structural Types catalog (Scarlata et al. 2007), based on principal component analysis of the surface brightness profiles. Our parent sample is selected requiring good quality spectra (see Straatman et al. 2018 for spectra quality flags) and available NUV, r, and J absolute magnitudes. In this way, we end up with 1622 sources.
However, these different criteria do not perfectly overlap (Renzini 2006). In fact, selections based on a single criterion are not stringent enough to reduce the contamination from star-forming outliers, with a percentage of contamination up to 10%-30% depending on the considered criterion. On the other hand, a combination of different criteria, maximizing the overlap of complementary information (photometric and spectroscopic), is significantly more effective in selecting a pure sample  Ilbert et al. (2013) and Mignoli et al. (2009), respectively. We use black colors for the criteria also adopted in this work, while gray lines are for illustrative purposes only. The hatched region (±0.05 mag with respect to the NUVrJ cut) indicates the location of green-valley galaxies (Davidzon et al. 2017 (Franzetti et al. 2007;Moresco et al. 2013). In this analysis, with the aim of studying the physical properties of cosmic chronometers, we are interested in having the purest possible sample, minimizing as much as possible the eventual residual contamination from star-forming outliers. We decided to combine different complementary cuts, as outlined below.
• NUVrJ selection. Photometric passive galaxies are selected using the rest-frame NUV − r) and (r − J) colors following the criterion proposed by Ilbert et al. (2013): (NUV − r) > 3 (r − J) + 1 and (NUV − r) > 3.1. These colors have been demonstrated to be extremely sensitive to reveal objects with recent (1-100 Myr) star formation episodes (blue NUV − r colors) even if they are dust-obscured (redder r − J colors), and are therefore optimized to safely separate quiescent and star-forming galaxies (Arnouts et al. 2007;Ilbert et al. 2015;Davidzon et al. 2017). With this cut, 658 sources are selected. We refer to these as the photometric passive sample.
• Emission-line cut. We further restrict our sample by excluding galaxies with a significant [O ii]λ3727 emission line, a tracer of photoionized gas, which is typically considered an indicator of ongoing star formation 1 . In particular, we exclude those galaxies that have an EW[O ii] > 5Å. The threshold is found to separate well star-forming and passive galaxies in previous spectroscopic surveys at similar redshift (e.g., Mignoli et al. 2009). Combining this and the previous cut, we obtain 485 galaxies that we refer to as the spectrophotometric passive sample.
• Visual inspection. The sample is further refined by visually inspecting all the remaining spectra. We remove galaxies with clearly strong [O ii]λ3727 and/or [O iii]λ5007 lines, obtaining typical S/N < 3 in their EWs. The latter is crucial to spectroscopically characterize galaxies at z 0.65 for which [O ii] is not available in LEGA-C spectra. Based upon these criteria, we obtain a final sample of 350 bona fide passive galaxies.
In Figure 1 we show the distribution of LEGA-C galaxies in two diagnostic diagrams (NUVrJ, Ilbert et al. 2013 Overall, the LEGA-C galaxies present two distinctive peaks in the redshift distribution at z ∼ 0.7 and z ∼ 0.9, with very few galaxies at z > 1. They form two separate populations in the NUVrJ plot, with a blue sequence reaching low (NUV − r) ∼ 1.1 colors and a red cloud that constitutes the photometric passive sample. Although a NUVrJ-only criterion drastically reduces the presence of star-forming systems in the sample ( log sSFR/yr = −11.8), about one-third of objects still have a significant [O ii] emission. Therefore, a spectroscopic selection, here performed by combining a cut on the EW followed by a careful inspection of all the remaining [O ii] and/or [O iii] contributions, is fundamental to ensure the purity of the sample. It is interesting to note that this does not only remove the tail of bluer (NUV − r) galaxies associated with the green-valley region, but also systems with redder colors. The final bona fide passive sample has a median redshift of z = 0.735. Passive galaxies are located toward the high-σ and M tails of the parent distribution. In particular, the median σ (log M /M ) increases from 164.5 km s −1 (10.75) to 205.7 km s −1 (10.95) moving from the parent to the bona fide passive sample, and most of the bona fide passive galaxies (85%) have log M /M > 10.6. With respect to the spectrophotometric, this sample has a SFR lower by 0.16 dex (with a median uncertainty of 0.18 dex). Finally, we note that the passive sample has a median specific star formation rate of log sSFR/yr = −12.1, with only 15 galaxies (∼ 4 %) reaching > −11, a value commonly adopted to classify "passive" galaxies (see Pozzetti et al. 2010).
We measure the composite spectrum of the bona fide passive sample by normalizing each rest-frame spectrum to the median flux between 4200 and 4400Å, and interpolating it onto a common grid (3000-5500Å, ∆λ = 0.35Å pix −1 ), considering only pixels with no problematic spectral flags. For each pixel, a σ-clipping is applied to reject fluxes F i deviating more than 4σ from the mean, minimizing the impact of potential night sky emission-line residuals. The composite spectrum is obtained by computing the median flux and the associated uncertainty, given by the normalized median absolute deviation (NMAD 2 ) divided by the square root  of the number of objects at each pixel. This technique provides a composite spectrum robust against imperfections and not biased toward galaxies with higher S/N. The resulting S/N per pixel measured around 4000Å rest frame is ∼ 230. In Figure 2 we show the median composite spectrum of the 350 passive galaxies selected in this work. The spectrum shows features typical of a passively evolving population. Qualitatively, we note a red continuum, indicative of an old stellar population; a significant 4000Å discontinuity; the absence of a Balmer break (3650Å); a Ca ii K absorption-line deeper than the associated Ca ii H (Section 2.3.1); and the presence of several metallic indices (e.g., G4300, Fe4383, Fe4531, and Mg 2 ). It is very important to notice that even at the very high S/N of the stacked spectrum, no emission line is detectable, confirming the robustness of the selection. On the contrary, the composite spectrum of galaxies excluded by visual inspection has significant [O ii] and [O iii] emission-lines.
In conclusion, the distribution of SED-fitting-derived stellar masses and SFR, as well as the analysis of the composite spectrum, confirms the reliability of our selection of passive galaxies. These systems at 0.6 < z < 1 show no detectable evidence of recent star formation. The presence of a possible underlying young component will be further assessed by studying spectral absorption features (Section 2.4).

Measuring spectral indices with PyLick
In its complete version, the Lick system consists of 25 indices in the rest-frame wavelength interval 4000 − 6000Å (Worthey & Ottaviani 1997;Trager et al. 1998). Each index requires the definition of a central region (λ c1 , λ c2 ) and two other regions located toward the red (λ r1 , λ r2 ) and blue (λ b1 , λ b2 ) of the central one to estimate a reference pseudo-continuum level. Following the approach of the Lick group, the strenghts of atomic I a and molecular I m indices are calculated using the following equations: where F (λ) and F c (λ) are the spectrum flux and the local pseudo-continuum, respectively. The latter is commonly derived through linear interpolation: where λ b,r and F b,r are the central wavelengths and mean fluxes of the lateral blue and red regions. Notes. (a) Expected redshift coverage within 0.6 < z < 1 of the VIMOS HR red spectrograph; (b) Computed between 5 th − 95 th percentiles; (c) Median index value and signal-to-noise ratio. Index units are angström for all indices except: CN1, CN2, Mg1, Mg2 (mag); D4000, Dn4000 (dex).
One of the strongest features in passive galaxy spectra is the 4000Å discontinuity (D4000). At bluer wavelengths, the flux suddenly declines due to the accumulation of a large number of spectral lines that are present in stellar types cooler than G0. This property makes it a good age tracer (e.g., Kauffmann et al. 2003;Moresco et al. 2011). A discontinuity index can be quantified as the ratio of the average flux density in two regions redwards and bluewards of 4000Å: where λ b1 , λ b2 = 3750, 3950Å and λ r1 , λ r2 = 4050, 4250Å in the original definition by Bruzual (1983). Alternatively, Balogh et al. (1999) introduced a narrower definition of the D4000 (D n 4000, estimated over the ranges 3850-3950 and 4000-4100Å), which is optimized to be less sensitive to reddening effects. To perform all the measurements of absorption features, we developed a dedicated Python code named PyLick. It is a public 3 flexible Python library developed for the fast and accurate estimation of the main spectral indices introduced in the literature from the UV to the near-IR. A more detailed description of the main elements of the code, as well as validation tests against available LEGA-C DR2 data, can be found in Appendix A. In this work, we will focus on atomic and molecular (Lick) indices and D4000. Because spectral indices are defined on fixed wavelength intervals, the measurement can be affected by systematics if spectral broadening effects (e.g., instrumental resolution, stellar velocity dispersion) are not properly taken into account. In general, this results in lower index values compared to intrinsic ones. To address this issue, we convolve each rest-frame spectrum (FWHM spec ≈ 1.3Å at z = 0.7) to match the resolution of the models that will be implemented (FWHM mod = 2.5Å), by using a Gaussian kernel with a standard deviation of σ = (FWHM 2 model − FWHM 2 spec ) 1/2 /2.355. After the measurement, we calibrate indices to zero-velocity dispersion following the approach described by Carson & Nichol (2010). The procedure to derive the correction coefficients is described in detail in Huchet et al. (2022, in preparation). Briefly, we measure indices on SSP spectra broadened at different σ generated from the 2016 version of Bruzual & Charlot (2003)  -4000Å discontinuity indices: D4000, D n 4000; -Two recently defined pseudo-Lick indices: Ca ii K, Ca ii H (Section 2.3.1). This dataset extends LEGA-C DR2 public catalog from Hβ to Fe5406 indices and, in particular, Mg ones, commonly used as proxies to study the α enhancement. For each index, we compute redshift coverage, 5th-95th percentile range, median value, and median S/N. These values are presented in Table 2.
Trivially, bluer (redder) indices are available only at higher (lower) redshift. This information, combined with the upper panel of Figure 2, gives an idea of the spectral and redshift coverage of this study. Most of the indices are available only for a narrow rest-frame wavelength range between 3900-4500Å. However, a significant number of galaxies (∼ 200) still share a common wavelength range between 3700-4900Å. In this case, the redshift coverage is reduced to z 0.9. The median S/N of the measured indices is > 10, except for those with a signal ∼ 0 (Balmer and CN indices), or those defined on a narrow central region (Ca4227, Ca4455) 2.3.1. The H/K ratio Rose (1984) firstly proposed to study stellar populations using the ratio of the minimum fluxes of two nearby lines. Specifically, Ca ii H over Ca ii K ratio was found to correlate with the starburst ages in poststarburst galaxies. As a matter of fact, H absorption line (λ = 3970.1Å), which is deeper in the presence of young A-and B-type stars, overlaps to Ca ii H (λ = 3968.5Å), while Ca ii K (λ = 3933.7Å) remains relatively uncontaminated. The nomenclature H/K is thus a compact notation for (Ca ii H + H )/Ca ii K and can be used as a stellar population diagnostic. Even the presence of a fraction of a young ( 1 Gyr) stellar component can significantly alter this spectral feature (Longhetti et al. 1999;Lonoce et al. 2014;Moresco et al. 2018). In the literature, this value has been usually measured as the ratio of minimum fluxes in H and K lines (e.g., Rose 1985;Leonardi & Rose 1996;Longhetti et al. 1999;Lonoce et al. 2014;Moresco et al. 2018): atively independent of changes in spectral broadening, like those due to the instrumental resolution and stellar velocity dispersion. However, measurements of minima can be strongly biased by the presence of noise peaks, especially in low-S/N regimes. For this reason, we adopt here a hybrid approach recently introduced by Fanfani (2019), where H/K is computed as a ratio of two pseudo-Lick indices, using the Ca ii K and Ca ii H index passbands listed in Table 3 that have also been included in PyLick. We measure and σ -correct these indices to derive H/K values and associated uncertainties for the entire parent sample.

Observed spectral features
The distributions of the main measured absorption features for the various subsamples of passive galaxies are presented in Figure 4, and the corresponding median values are quoted in Table 1.
Overall, the bona fide passive galaxies tend to segregate with respect to the parent sample median. We observe typical H/K ratios of 0.96 ± 0.08 and very few systems with H/K > 1.1, while parent sample galaxies reach values well above 1.2. It is important to stress here that even without applying any cut on the H/K ratio, our selection criteria allows us to discard the majority of the tail at high H/K, leaving us a sample with values characteristic of a passive population. This is important and independent evidence supporting the purity of our selection and the negligible contamination from underlying young stellar populations in the passive sample. Another interesting result that we report for the first time is that the Ca ii K line itself shows a clear bimodality, with passive galaxies having Ca ii K > 5Å. This is not the case for Ca ii H, because as mentioned above, the H line strengthens the index in younger populations. We note here that Ca ii K is likely to be one of the main drivers of the D4000 bimodality as it is included in the blue passband of the D4000 index, while the Ca ii H shows a much flatter trend as a function of redshift. The well-known D n 4000 and Hδ bimodalities are already studied in detail in the local universe (Kauffmann et al. 2003;Siudek et al. 2017) and in LEGA-C data (Wu et al. 2018). Photometrically selected massive and passive galaxies already populate high-D n 4000 and low-Hδ A tails, indicating relatively old stellar populations. The addition of a spectroscopic criterion removes a significant number of remaining low-D n 4000 and high-Hδ A galaxies. Among bona fide passive galaxies, only 11 (3%) have D n 4000 < 1.5, and only 28 (8%) have Hδ A > 2.5Å. However, we underscore here that no additional cuts have been applied to minimize selection biases. Interestingly, we observe a slight bimodality also for G4300, which measures the optical CH band (also known as G-band) and is very sensitive to the carbon abundance (Tripicco & Bell 1995;Korn et al. 2005). The bona fide passive galaxies have a relatively high G4300 2.5Å. Finally, they are characterized by high Fe4383, a primary indicator of stellar metallicity, with typical values of Fe4383 ∼ 3.8Å.
We further explore the bona fide passive sample by looking at index-z-σ trends. For this purpose, we firstly divide galaxies into two stellar velocity dispersion bins using σ = 215 km s −1 as the threshold value. Then, we divide each subsample into four or three redshift bins depending on the redshift coverage, considering intervals ∆z ∼ 0.08 − 0.1. All the bins have on average N ∼ 30 objects and in all cases N > 10. Finally, we compute the mean of the redshift and corresponding index in each bin, along with its associated errors. The resulting trends for the main spectral features are presented in Figure 5. We also quantify these trends within individual galaxies by using the nonparametric Spearman rank correlation test.
In general, we observe that Balmer indices correlate moderately with redshift with a Spearman ρ ∼ 0.4 (p- value 10 −8 ). At fixed z, high-σ galaxies have weaker Balmer indices. Similar behavior is observed for the Ca ii K index and -with an opposite trend -for the H/K ratio. D n 4000 anticorrelates moderately with redshift with a Spearman ρ ∼ −0.3 (p-value 10 −5 ). At fixed z, there is a clear separation between the two σ regimes. Note that these relations could be (and have been) used in the cosmic chronometers framework to constrain the expansion history of the Universe, once the D4000-age relation is carefully calibrated (Moresco et al. 2011). Temporarily ignoring metallicity effects that will be discussed in the following, these observed trends are fully consistent with the mass-downsizing scenario, i.e. more massive galaxies formed earlier and in short timescales, and then experienced a passive evolution. Further support to this idea is given by iron indices, which are stronger for high-σ galaxies but do not show significant evolution over ∼ 2 Gyr of cosmic time. This finding supports the scenario in which this population has already exhausted its gas reservoir, hence not being able to significantly evolve their metallic content, and is evolving passively as a function of cosmic time.
The absence of a correlation in redshift for C 2 4668 and G4300, instead, shows that these indices cannot be used as age indicators throughout cosmic time. However, the segregation in σ is still consistent with the idea that they are sensitive to stellar population age in the very first Gyr after the formation. Another explanation is that the segregation is due to different stellar metallicities and abundances. Even in that case, they could be crucial to determine galaxy ages indirectly by breaking the age-metallicity degeneracy.
It is also interesting to notice that Fe4383 shows a very flat behavior as a function of redshift, for both σ bins. This trend, confirmed also by the analysis of the stellar metallicity presented below, supports the scenario for which these systems formed the majority of their mass (and hence metallicity) at high redshifts and in very concentrated episodes of star formation, exhausting their gas reservoirs and therefore not being able to further change their metal content.
Qualitatively similar trends are obtained also maximizing the number of galaxies per bin instead of using bins of constant ∆z. Given the inhomogeneous redshift coverage of LEGA-C, this latter method helps to strengthen bin statistics but has the cost of losing leverage in redshift.

Morphology
We analyze the morphological content of the bona fide passive sample using the Zurich Estimator of Structural Types (ZEST) classification (Scarlata et al. 2007), based on principal component analysis of the surface brightness profiles. We find that the great majority (71%) of galaxies are classified as E/S0, 27% as intermediate (where the contribution of the bulge is similar to that of the disk), 2% (six galaxies) as irregular, and none show a late-type morphology. Among the six galaxies classified as irregular, only two appear in the final sample of constrained galaxies. Their morphology is not significantly disturbed and their spectra are typical of old stellar pop-ulations, therefore we still include them in the bona fide passive sample and verify that their exclusion does not affect our results. Within the bona fide passive sample, we observe no significant correlation between different morphological classes and σ . Interestingly, a similar percentage of E/S0 types (72%) has also been found by Moresco et al. (2013) in their sample of ∼ 17000 zCOS-MOS galaxies. The two works share 127 galaxies (37% of the bona fide passive sample) and adopt the same morphological catalog. However, selection criteria are different because the authors make use of the best-fit SED matching a local E-S0 template and different color cuts, along with spectroscopic cuts based on both [O ii] and Hα emission.
The presence of passive systems with non-purely earlytype morphologies has been already discussed in the literature (e.g., Dressler et al. 1999;Pozzetti et al. 2010). This result can be explained by considering the existence of a class of objects for which the morphological transformation lasts longer than changes in stellar population, i.e. galaxy colors redden before the galaxy reaches an early-type morphology. It is therefore improper to treat galaxies without an early-type morphology as contaminants.
In conclusion, physical properties derived from SED fitting ( Figure 1) and observed spectral properties of the three different subsamples considered (Figures 2, 4, and 5), confirm the existence of a population of passive systems characterized by lower/absent star formation, older stellar populations, and higher metallicities with respect to the parent galaxy population, and consistent with the mass-downsizing scenario. These observational data also support that the bona fide selection adopted is able to maximize the purity of the sample, providing a sample of massive and passive galaxies, with a negligible (if any) contamination by star-forming outliers. The analysis of each individual galaxy will add more granularity to this picture.

Analysis
With this work, we are exploring the capability of studying cosmic chronometers with Lick indices, which allow us to compare with several literature studies (e.g., Thomas et al. 2010;Gallazzi et al. 2014;Onodera et al. 2015). The stellar ages derived with this method are light weighted and can be biased toward lower values because young stars outshine the older ones (Conroy 2013), while metallicity and abundance are found to be less affected (Serra & Trager 2007). When a young stellar population is present, mass weighted age -which can be derived with full spectral fitting codes -can provide a better estimate of the integrated galaxy SFH. From the analysis of the H/K ratio (Sect. 2.3.1) we can exclude a contamination from young (200 Myr-1 Gyr) stars in our bona fide passive galaxies, therefore also the difference between light-and mass-weighted is expected to be negligible.
To more quantitatively assess the robustness for the CC method, in Kang et al. (2022, in preparation) we perform a full spectral fitting analysis using the publicly available BAGPIPES code (Carnall et al. 2018) to determine the ages, metallicities, and SFHs of the bona fide galaxies selected in this paper. While we refer to that paper for the detailed description of the methods adopted and results obtained, we anticipate here that the slope of the age − z relation is very robust and in perfect agreement between our two studies, with a percentage difference 5% ( 0.2σ).

Stellar population model
In this work, we adopt the Thomas, Maraston, & Johansson (2011) Maraston (2005) and element response functions from Korn et al. (2005), and are carefully calibrated with galactic globular cluster data. The main ingredients are single-burst SFH, Salpeter (1955) initial mass function, MILES empirical stellar libraries (Sánchez-Blázquez et al. 2006), andCassisi et al. (1997) stellar evolutionary tracks. In this work, we use the models provided at a MILES resolution of 2.5Å (Beifiori et al. 2011).
The original grid spans the following parameter space: 0.1 < age/Gyr < 15, −2.25 < [Z/H] < 0.67, and −0.3 < [α/Fe] < 0.5 with a total of 480 grid points, each one corresponding to the prediction for a single SSP. We perform a three-dimensional linear interpolation of the grid to reach a resolution of ∆age = 0.1 Gyr, and ∆[Z/H] = ∆[α/Fe] = 0.01 dex. This process allows us to achieve higher precision in parameter estimation. We verified that our approach does not introduce any systematic bias, since by considering different grid choices the final results are always fully compatible with each other within 1σ.
These models assume an instantaneous burst of star formation. While this assumption might in principle lead to a significant underestimation of the global stellar age in mixed populations, we note here that our selec-tion criteria were chosen to obtain a sample with a very concentrated SFH and to minimize the contamination from a significant residual star formation, as confirmed by the analysis of various indicators discussed in Section 2. In particular, for this population we expect the SFHs to be extremely coeval, with very small durations (τ 0.3 Gyr, if modeled with a delayed exponential SFH), as confirmed from a parallel analysis (Borghi et al. 2022).

MCMC Analysis
To compare the measured absorption features to TMJ11 models, we develop a fully Bayesian analysis pipeline. We assume that the uncertainties on indices are well-determined, Gaussianly distributed, and independent. A set of modeled indices, which are a function of parameters θ = (age, [Z/H], [α/Fe]), can therefore be fitted to the observed ones using the log-likelihood function where k is a constant, I mod i (θ) the model prediction for the ith observed index I i , and σ i its uncertainty. The posterior probability distributions of θ are explored using the affine-invariant ensemble sampler emcee We use flat priors that span the entire parameter space allowed from the models. An important point that we stress here is that we do not assume any cosmological prior for galaxy ages. This is a crucial point, and a difference with respect to other similar works, to avoid introducing cosmological biases in the age determination and keep the results cosmological-independent.
After the analysis, we carefully assess the convergence of each chain. A detailed description is given in Appendix B.

Index combination
The choice of the index set to analyze must be carefully addressed because, given the number of the measured indices, there are more than one million possible combinations. The relative sensitivity of different indices to different stellar population parameters and abundances is not identical (Tripicco & Bell 1995;Korn et al. 2005;Lee et al. 2009). Balmer and D n 4000 indices are better suited to constrain ages, Fe-dominated indices to measure the Fe abundance and total stellar metallicity, while Mg indices to estimate the α elements abundance. It is therefore intuitive that different index sets provide different constraints. Because the ultimate goal of this work is to avoid any effect that could potentially introduce biases in the age − z relation, we choose to fit galaxies considering always the same set of indices. This means, on one hand, that using indices covering a wide redshift range will significantly reduce the number of galaxies analyzed; on the other hand, to maximize the number of analyzable galaxies the set of indices has to be chosen to span a small wavelength range (from 3600 to 4900Å; see the upper panel of Figure 2 and Table 2). The tradeoff between these two aspects combined has the cost of reducing by about one-third the number of galaxies measured but guarantees the crucial advantage of providing a homogeneous analysis.
For these reasons, we decide to use the following set of Lick indices: Hδ A , CN 1 , CN 2 , Ca4227, G4300, Hγ A , Hγ F , Fe4383, Fe4531, and C 2 4668. They are chosen among those that are calibrated against globular cluster data in TMJ11, but excluding those redder than Hβ, because, as mentioned above, they would not allow us to obtain a statistically meaningful sample. We also exclude Hβ because it can be biased by the presence of a residual emission component (e.g., Concas et al. 2017). The final set spans a narrow wavelength range in the optical regime, from 4000 to 4800Å. We verified on a few available galaxies that the inclusion of Mg b does not significantly change the results. We also performed extensive tests with very different sets of indices. A detailed discussion is presented in Appendix C. Given the available data, we find that this is the optimal set, because it maximizes both the spectral coverage and the number of galaxies for which we obtain constraints, and it also provides a good balance between age-, metallicity-, and α-sensitive indices.
In the end, we obtain robust constraints for 140 galaxies over 199 that have been analyzed (Table 4)

Results and discussion
In this section, we present the results obtained from the analysis of the previously defined H/K ratio and for the physical properties of selected passive galaxies in the LEGA-C DR2 survey. We start by discussing the H/K ratio and the correlation with commonly used diagnostics in Section 4.1. In Section 4.2 we explore scaling relations of stellar population parameters versus σ and M . In Section 4.3 we discuss trends with redshift, with a focus on the age-redshift relation. Finally, in Section 4.4 we present the median binned parameters-redshift relations.
We underline here below the main points of our analysis to be kept in mind when comparing our results with those of previous analyses in the same field.
(i) Our galaxies are passive. Similar studies are mostly targeting morphological early-type galaxies. These samples may contain galaxies with a nonnegligible level of star-formation (∼ 20% at the current stellar masses; Moresco et al. 2013); (ii) We do not use cosmological priors. We also verified that, especially for low-S/N galaxies, the use of a cosmological upper limit for the galaxy ages may produce an apparent convergence of Markov Chain Monte Carlo (MCMC) chains at higher ages toward the prior, introducing a bias in the sample (Appendix B); (iii) We adopt single-burst SFHs. In this case, the time of the onset of star-formation t form coincides with the time of the main (and only) star-formation event t peak , and with the time of quenching t quench . A proper comparison with other data sets where an extended SFH is assumed should carefully take into account how ages are defined and intrinsic degeneracies between SFH parameters. For instance, by assuming an SFR(t) ∝ exp((t − age)/τ ) (tau model), a positive correlation between the timescale τ and the galaxy age (time since t form ) can be present, with values of ∆τ ∼ 0.3 ∆age (Borghi et al. 2022).
Before applying the cosmic chronometer method, points (i) and (ii) should be carefully considered, but it is important to stress that this method relies on differential -not absolute -galaxy ages. Therefore, by assuming a more extended SFH for the entire population of these galaxies (i.e. a vertical offset in the age − z plane), the final H(z) measurement is not affected (see Section 3.1).

The H/K ratio as a stellar population diagnostic
In Figure 6 we show the distribution of the entire parent sample in four widely used diagnostic diagrams, namely NUVrJ, UVJ, EW[O ii]-D n 4000, and SFR-M , color-coded by the H/K ratio. To better capture the mean trends of H/K, we perform locally weighted regression (LOESS). We use the LOESS package of Cappellari et al. (2013a) based on the two-dimensional algorithm of Cleveland & Devlin (1988), with a linear local approximation and a regularization factor f = 0.5. Selected bona fide passive galaxies are highlighted with black borders. To quantify these trends, we use Spearman correlation coefficients.
The diagrams show a clear and significant correlation between the H/K ratio and other diagnostic tools despite its small dynamic range (0.8 H/K 2, and a median error of ∼ 0.14). In more detail, we observe a strong correlation with (NUV-r) (ρ = −0.72; p-value = 10 −164 ) and with (U − V ) (ρ = −0.72; p-value = 10 −178 ). The correlation with (r − J) and (V − J) colors is weaker but still significant (ρ = −0.25; p-value = 10 −19 and 10 −15 , respectively). Interestingly, we find that a selection based on the threshold value H/K < 1.1 can reproduce the NUVrJ selection with 17% incompleteness and 19% contamination.
As shown in the third panel, an H/K cut does not exclude galaxies with a significant [O ii] emission. For instance, 19% of the plotted galaxies have H/K < 1 and EW[O ii] > 5Å. Unfortunately, the lack of other spectral features as Hα, [N ii], and [S ii], due to the limited wavelength coverage of the current data set, does not allow us to investigate the nature of these sources. For these systems, the combination of multiple indicators is still needed to obtain a pure sample of passive galaxies.
A strong correlation is also observed for the sSFR (ρ = 0.70; p-value = 10 −161 ). We find that a threshold on H/K values of H/K < 1.1 can reproduce a . Each panel shows the parent sample galaxies extracted from LEGA-C DR2 with a S/N of the H/K ratio higher than 3. Black borders identify bona fide passive galaxies. Dashed lines are the criteria to separate star-forming from passive galaxies taken from the literature (Ilbert et al. 2013;Williams et al. 2009;Mignoli et al. 2009;Spilker et al. 2018, respectively). We use black colors for the criteria also adopted in this work, while gray lines are for illustrative purposes only.
log sSFR/yr < −11 cut with 15% incompleteness and 16% contamination. These are remarkable results if we compare the data needed for the two different selections, and the range of wavelengths spanned. On one hand, a wide photometric coverage is needed for a reliable estimate of a NUVrJ diagram, SFR, or M (typically from the UV to the near-IR), with an accuracy increasing with the number of available photometric points; it is therefore not always available in many surveys. On the other hand, we have a feature defined over a window of only about 150Å for which deep rest-frame optical spectroscopy is needed. The H/K can therefore play a key role in the selection of pure samples of passive galaxies in future wide-field grism surveys such as Euclid (Laureijs et al. 2011) and the Roman Space Telescope (Spergel et al. 2015). Another advantage of this diagnostic is the mild dependence on spectral resolution. Differences between 0.0 0.5  Figure 7. Distribution of mean stellar ages, metallicities, and α enhancements as a function of stellar velocity dispersion for individual massive and passive galaxies selected in LEGA-C DR2 (violet points). Gray contours represent early-type galaxies in the local Universe from SDSS/MOSES (Thomas et al. 2010), and enclose 1σ, 2σ, and 5σ regions. Dashed lines and shaded regions are robust linear fits and associated 2σ scatter regions, for 2.1 < log σ < 2.5. The resulting slopes and their uncertainties are annotated in the bottom left. Black arrows represent the age evolution expected for a passive SSP.
H/K values obtained on 8 and 2.5Å FWHM spectra are 4%. Performing the same analysis on individual Ca ii K and Ca ii H indices yields ∼ 10% systematic differences.
Concerning the 140 bona fide passive galaxies, we note here that no correlation between stellar population parameters (especially age) and H/K is present. This result is expected given the fact that no significant contribution from a young stellar component is present, i.e. the H line deepening effect becomes negligible in the passive regime.
In summary, young (∼ 200 Myr) stellar populations whose light is predominantly due to A-and B-type stars (H/K > 1.1) are characterized by higher UV fluxes, lower D4000, a higher EW[O ii], and dominate the star formation main sequence of LEGA-C galaxies. The H/K lowers when approaching the quiescence criteria, but the study of timescales and the interplay with stellar population parameters for the whole population of galaxies will require a further assessment. Figure 7 shows stellar population parameters as a function of the observed stellar velocity dispersion for passive galaxies at z ∼ 0.7. As a local control sample, we consider the SDSS/MOSES data set of Thomas et al. (2010) of morphologically selected early-type galaxies (ETG) at z ∼ 0.05, since similar models and analysis techniques are adopted. Their sample is ∼ 20 times larger than ours and spans a wider range in log σ (1.7-2.5). For a fair comparison, the local sample is limited to comparable stellar velocity dispersion values, i.e. 2.1 < log σ [km s −1 ] < 2.5. To this catalog we add galaxy stellar-mass estimates from MPA-JHU DR 8 (Kauffmann et al. 2003). To study log age, [Z/H], and [α/Fe] versus log σ and log M * we perform robust linear regression with the least trimmed squares (LTS) algorithm (Rousseeuw 1984) and measure their Spearman coefficients. Results are quoted in Table 5.

Physical parameters versus σ and M
The general trends in σ are consistent with a stellar population that experienced a passive evolution from z ∼ 0.7 to z ∼ 0.05. In our z ∼ 0.7 sample we find shallower relations between the stellar population parameters and σ . This is expected from the lower statistics and the different selection criteria adopted. Moreover, the dynamic range available in age decreases with increasing redshift because the Universe gets younger. We first consider SSP-equivalent stellar ages. Remarkably, the difference of 5.5 Gyr between the two samples is perfectly in agreement with the age evolution of the Universe, confirming that the stellar populations of this population of galaxies experienced a pure passive evolution within the assumed reference cosmology. Contamination from young stellar populations in the high-z (lowz) sample would produce a larger (smaller) offset. We find mild correlations between log age vs. log σ (with a slope of 0.5±0.1) and log age vs. log M (with a slope of 0.19±0.04). To facilitate the comparison between different works, we convert galaxy ages to formation redshifts z form , and study the relation with M , obtaining z form = (0.40 ± 0.05) log 10 M 10 11 M + (1.46 ± 0.02) , with an intrinsic scatter of ∼ 0.24. This means that galaxies with higher stellar mass (log M /M = 11.3) formed their stars at z form ∼ 1.6, while less massive ones (log M /M = 10.7) formed their stars at z form ∼ 1.3. Interestingly, the formation epoch of this population of passive galaxies is found to take place right after the peak of the cosmic star formation rate density (z ∼ 2; Madau & Dickinson 2014). We also find few (22, ∼ 16% of the passive sample) very massive (log M /M > 11) galaxies with a z form > 2.5 up to 5.
In Figure 8 our results are compared with the existing literature where similar analysis methods are used. We find very good agreement with the work of Jørgensen & Chiboucas (2013), who studied spectra of ∼ 80 cluster galaxies at z = 0.5−0.9 comparing observed Lick indices with TMJ11 models. They found formation redshifts of z form ≈ 1.24 and 1.95 for stellar masses of log M /M ≈ 10.6 and 11.4, respectively. We also find an excellent agreement with the work of Choi et al. (2014), who analyzed stacked spectra of sSFR-selected passive galaxies at comparable redshifts and masses with the full spectral fitting method. By assuming single-burst SFHs, the authors found typical formation epochs of z form ∼ 1.5. Finally, Gallazzi et al. (2014) analyzed ages and stellar metallicities for ∼ 70 between star-forming and quiescent galaxies at z ∼ 0.7 using age-sensitive D n 4000, Hβ, and Hδ A +Hγ A , and metal-sensitive [Mg 2 Fe] and [MgFe] indices. Rewriting Equation 8 in terms of the formation time t form versus log M , we obtain a trend of −1.26 ± 0.27 Gyr per decade in mass. Remarkably, this is in excellent agreement with the trend observed by Carnall et al. (2019) of −1.48 ± 0.37 Gyr per decade in mass, using a very different approach (full spectral fitting) and assuming more extended (double power-law) SFHs.
We now move to the analysis of mean stellar metallicities. Our results show no significant evolution in [Z/H] for passive galaxies since z ∼ 0.7, with a median offset of 0.05 dex comparable to the median uncertainty. This confirms and statistically strengthens earlier results from Gallazzi et al. (2014) at similar redshifts, and it confirms several works that found similar metallicities up to z ≈ 2 (Onodera et al. 2012(Onodera et al. , 2015Citro et al. 2016;Estrada-Carpenter et al. 2019). On the other hand, recent studies report a significant evolution of ∆[Z/H] > 0.1 dex. An example is the work by Beverage et al. (2021) based on a sample of 65 LEGA-C quiescent galaxies analyzed with a full spectral fitting code. The authors found no evolution in [Mg/Fe] values, but a ∆[Fe/H] (hence ∆[Z/H]) of about 0.2 dex with respect to local log(M /M ) = 11 quiescent stacks. We also study [Z/H] versus log σ and log M , finding no significant correlations. Again, this can be attributed to our stricter selection criteria of the most passive systems, which are known to have shallower metallicity-mass relations (see Peng et al. 2015;Gallazzi et al. 2014Gallazzi et al. , 2021. Indeed, when dividing them into two σ bins, we find that the typical [Z/H] of σ > 215 km s −1 systems is 0.1 dex higher than low-σ ones, consistently with the downsizing scenario. Finally, we analyze the mean stellar α enhancements. The [α/Fe] values of our galaxies are ∼ 0.1 dex lower  . Left panels: distribution of single-burst stellar ages, metallicities, and α-enhancements as a function of redshift for 140 bona fide passive galaxies selected in LEGA-C DR2 (violet points). The error bars are obtained as the 16 th and 84 th percentiles of the marginalized posterior distributions. In the age-redshift panel, we shade the parameter space not allowed for a general cosmology (gray solid), as well as the formation redshift assuming a pure passive evolution (dotted lines). Right panels: median binned relations obtained by dividing the sample into two σ regimes, with σ = 215 km s −1 as a threshold. Each bin contains 5-40 objects. Error bars on the y-axis are errors associated to the median values, while those on the x-axis indicate the bin widths. than the local sample. However, as discussed in Appendix C, this could be completely explained by the fact that we could not use Mg indices in our baseline analysis. For the first time, we identify a positive trend between [α/Fe] and log σ with a slope of (0.2 ± 0.1) for a large population of individual passive galaxies at z ∼ 0.7. This confirms the trends observed with stacks of quiescent galaxies at 0.1 < z < 0.7 by Choi et al. (2014). We do not find a significant trend in log M . On one hand, local early-type galaxies show stronger correlations with the gravitational potential well (of which σ is a tracer) than with M (e.g., Thomas et al. 2005;Barone et al. 2018). On the other hand, uncertainties in the M derivation may play a role in washing out these relations.
In conclusion, the sample of selected passive galaxies at z ∼ 0.7 shows trends in age, [Z/H], and [α/Fe] versus σ in agreement with those expected from a passively evolving population. The age offset, as well as the lack of a significant offset between the typical values of [Z/H] and [α/Fe] is evidence that these systems should have formed their stars on short timescales, depleting the great majority of their gas reservoirs, and experienced passive evolution since then. Figure 9 shows the derived stellar population parameters as a function of redshift. Interestingly, despite not having imposed any cosmological prior on the age of the galaxies, our derived ages are in all cases in agreement with a generic cosmological model, never exceeding the age of the Universe at any redshift. Moreover, the upper envelope of the distribution follows the decrease expected from the aging of the Universe. At all red-shifts, only a few passive galaxies have ages 2 Gyr owing to the stringent selection criteria. The median value is age = 3.01 Gyr with a ±0.97 Gyr 1σ scatter. This implies a formation time of t form ∼ 4 Gyr after the Big Bang, corresponding to a formation redshift of z form ∼ 1.5 as previously discussed (Section 4.2). Stellar metallicities have solar or slightly supersolar values, [Z/H] = 0.08 dex with a ±0.18 1σ scatter, spanning a very narrow range if compared to the initial parameter space (−2.25 < [Z/H] < 0.67). Differently from stellar ages, they show no sign of evolution even within the redshift range explored in this work.

Physical parameters versus z
We find slightly supersolar [α/Fe] values, [α/Fe] = 0.13 dex with a ±0.11 1σ scatter. In particular, 124 (89%) galaxies have [α/Fe] > 0, pointing to very short formation timescales, i.e. before Type Ia supernova explosions can pollute the interstellar medium with a relatively high amount of iron-peak elements. As for the metallicities, these star formation timescales indicators do not significantly evolve over the redshift probed.
Overall, these results confirm the passive evolution within the ∼ 1.6 Gyr interval of cosmic time explored in this work shedding light on the granularity of the physical properties and SFHs. The uniformly small scatter in [Z/H] and [α/Fe], 0.20 dex at fixed σ or z, confirms the large homogeneity of the sample and puts strong constraints on the duration of the chemical assembly of these systems.

Median binned relations
To provide a comprehensive picture with the results discussed in Section 4.2 and 4.3, we bin age, [Z/H], and [α/Fe] in σ and in z. Binning in σ instead of M has the benefit of avoiding model-dependent effects introduced by SED-fit modeling, other than using an observational quantity. The galaxies were firstly binned into two σ using σ = 215 km s −1 (approximately equivalent to log M /M ≈ 11) as a threshold, then into four equally spaced redshift bins, with ∆z = 0.075 from z = 0.6 to z = 0.9. Bins have from ∼ 5, at higher z, to ∼ 40, at lower z, objects. To each bin, we assign a mean z value and a median (age, [Z/H], [α/Fe]) with associated uncertainty. Results are shown in the right panels of Figure 9.
Clearly, the median properties of the analyzed passive galaxies follow a downsizing pattern. At each cosmic epoch, stellar populations hosted in galaxies with higher mass are older, more metal rich and alpha enhanced. This suggests that their formation occurred at earlier times, with a difference of ∆age ≈ 0.5 Gyr, and on shorter time scales with respect to less massive ones. We note that these trends were already qualita-tively confirmed from the analysis of the main absorption indices (Section 2.4).
Last but not least, it is remarkable that for each σ regime we find a clear, almost parallel, age-redshift relation. The study of their differential evolution will allow us to perform cosmological studies using the cosmic chronometer approach (Borghi et al. 2022).

Conclusions
In this work, we take advantage of the public Data Release 2 of the LEGA-C spectroscopic survey to place constraints on the stellar population properties of individual massive and passive galaxies at 0.6 < z < 0.9. Based on a robust spectral analysis of Lick indices, our aim is to characterize this population and to explore the reliability of using these galaxies as cosmic chronometers. Our main results are summarized below.
1. We select a pure sample of 350 passive galaxies at z ∼ 0.7 combining a photometric NUVrJ selection, a spectroscopic EW[O ii] cut, and a careful visual inspection of individual spectra to further remove galaxies with significant emission-lines (Figure 1). As confirmed by the stacked spectrum (Figure 2), no underlying emission line components are present in the sample, confirming its high purity. Selected passive galaxies have a median observed velocity dispersion of σ = 206 km s −1 , stellar mass of log M /M = 10.95, and very low specific star formation rate log sSFR/yr = −12.1. Most of them have an early-type morphology, but there is also a nonnegligible percentage of systems (about one-third) with an intermediate morphology.
2. We develop, validate, and publicly release PyLick, a flexible Python tool to measure absorption features, implementing several different index definitions (see Appendix A). This allows us to measure spectral indices over a wide wavelength range in LEGA-C data, extending the current public catalog of Lick indices by Straatman et al. (2018) and enabling a more detailed exploration of the dependence of our results on different index combinations.
3. We introduce the H/K ratio, a new diagnostic feature defined as the ratio of pseudo-Lick indices Ca ii K and Ca ii H (Figure 3). We verify that it is an excellent tracer of potential contamination of the sample due to star-forming or young populations, confirming that our sample is compatible with no or negligible contamination, with H/K = 0.96 ± 0.08 (1σ). Moreover, a selection based on H/K < 1.1 is found to reproduce a NUVrJ selection (Ilbert et al. 2013) or a log sSFR/yr < −11 cut ( Figure 6) with a small percentage of incompleteness (∼ 15%) or contamination (∼ 15%) while requiring a much narrower wavelength range.
4. Using an optimized combination of Lick indices (namely, Hδ A , CN 1 , CN 2 , Ca4227, G4300, Hγ A , Hγ F , Fe4383, Fe4531, and C 2 4668), we measure single-burst stellar age, [Z/H], and [α/Fe] for 140 passive galaxies, without assuming cosmological priors on the maximum value of age as a function of redshift. We also performed an extended analysis to assess the impact of different choices of indices, verifying that our findings are robust against the choice of a different combination of indices (Appendix C).
5. We find trends between log age, [Z/H], [α/Fe], and the stellar velocity dispersion consistent with those expected from a passively evolving population, with slopes of (0.5 ± 0.1), (0.3 ± 0.2), and (0.2 ± 0.1), respectively (Figure 7). This analysis shows, for the first time using individual galaxies, that a relation between [α/Fe] (a star formation timescale proxy) and σ is already in place at z ∼ 0.7. Moreover, the age difference of 5.5 Gyr between our sample and local ETGs can be entirely accounted for by a pure passive evolution. Assuming a standard ΛCDM cosmology, the relation between formation redshifts and galaxy stellar masses is found to agree with several previous analyses (Figure 8), confirming that this population of massive galaxies forms at z form ∼ 1.3(1.6) at masses log M /M = 10.7(11.3), after the peak of the cosmic star formation rate density.
6. Even if we do not impose any cosmological prior to the age of the population, the obtained ageredshift evolution is consistent with a ΛCDM universe ( Figure 9). 7. Finally, the analysis of median binned relations confirms the downsizing scenario and the passive nature of this population. Remarkably, we obtain two clear nearly parallel age − z relations for both the higher (σ ≈ 230 km s −1 ) and the lower (σ ≈ 200 km s −1 ) mass regimes. This difference of ∆age ≈ 0.5 Gyr can be interpreted as a delay in formation time between the two, with later formation epochs for the population of less massive galaxies.
Overall, our analysis of individual galaxies confirms the existence of a population of passively evolving galaxies at intermediate redshift that follows a downsizing pattern. In a subsequent paper, we will make use of their median age − z relation to derive constraints on cosmological parameters and obtain a direct measure of the Hubble parameter H(z) using the cosmic chronometers approach. In this context, we also plan to study in more detail the effect of assuming different SFHs. In this way, we will be able to study for the first time at these redshifts the detailed stellar population properties of passive galaxies and their underlying cosmology, jointly.
In the more distant future, dedicated large spectroscopic surveys would be crucial to extend the present analysis, providing at the same time unique opportunities for gravitational wave astronomy and cosmology (Palmese et al. 2019).   Figure A2. Examples of contour plots that may generate unreliable (red) and reliable (blue) results. Axes cover the full parameter space allowed from the models. Contours enclose 1 and 2σ confidence regions. and outliers are mostly galaxies with lower-quality spectra. There is no trend between relative differences and the indices values, and the distribution is qualitatively Gaussian.
We find good agreement also for the uncertainties, with a typical η(σ) of ∼ 10 −2 (∼ 10 −1 1σ scatter). Larger deviations are seen for indices with values ∼ 0 (Balmer indices, CN 1 ), for which measured uncertainties are ∼ 10% lower. This difference can be due to different methods to estimate formal errors.
We note that at the current resolution (R ∼ 3500), the method used to interpolate the spectra can introduce higher discrepancies than those observed before. In particular, using 0 th order interpolation the typical scatter in η(I) increases up to ∼ 10 −2 . Overall, these results confirm the reliability of PyLick to measure indices values and formal errors from observed spectra.

Appendix B. Assessing convergence and reliability of MCMC posterior distributions
In a Bayesian analysis, it is crucial to determine whether MCMC chains are reproducing with sufficient accuracy the target posterior distribution. However, there is not an established standard to assess the convergence (Hogg & Foreman-Mackey 2018;Roy 2020). A possibility is to take into account the autocorrelation time τ int analysis, where τ int quantifies how many steps are needed to generate independent samples. We consider a chain to be formally converged when τ int for each parameter is greater than 1/100th of the chain size. In this work, analyzed galaxies typically require ∼ 7000 steps. However, sample averages derived from formally converged chains could still be unreliable. This is the case when posterior distributions: 1. are skewed toward the priors (in this work, the limits of the parameter space allowed from the models); 2. are not predictive (mainly due to high agemetallicity degeneracy); 3. are multimodal.
In Figure A2 we show contour plots representative for the three categories and a typical contour plot for a good fit. We note that the age-metallicity degeneracy follows the so-called "3/2 rule" (Worthey 1998), where an increase (decrease) of log age by a factor of 2, when accompanied by a decrease (increase) of stellar metallicity [Z/H] by a factor of 3, can reproduce the same set of observed indices. Multimodal distributions are obtained for less than 4% of the analyzed galaxies and follow the direction of the age-metallicity degeneracy. As they do not recur for the same galaxy when slightly different sets of indices are considered, they should be considered as a special case of (2) and to be due to intrinsic degeneracies rather than a real complexity in the stellar population.
We check joint and marginal distributions of all the 199 analyzed galaxies, and we flag and exclude galaxies belonging to these three categories.
In Figure B1 we show the distributions of the main parameters before and after this process. The great majority (∼ 85%) of excluded galaxies have overall spectral S/N < 25. However, a cut of this kind applied a priori would have halved the final sample, excluding also good converged fit. Low-σ galaxies are preferentially excluded because they have relatively lower S/Ns. Distributions of stellar population parameters do not change significantly before and after this process. The posterior distribution of ages for galaxies with age > 8 Gyr tend to be heavily skewed toward the 15 Gyr prior and are also characterized by low metallicities ([Z/H] < −0.4) and relatively high α/Fe values.
We observe that S/Ns of individual indices are a good indicator to separate included and excluded galaxies. For this purpose, we use indices with relatively higher S/N (see Table 2), namely, G4300, Fe4383, Fe4531, and C 2 4668. The great majority (∼ 90%) of excluded galax- ies have at least one index with S/N < 10. A cut of this kind would still exclude ∼ 29% of galaxies with reliable constraints. Anyway, we note that these cuts are not universal because they vary for different index combinations, and individual S/Ns of spectral indices vary for different spectral resolutions.
In conclusion, we find that the inspection of MCMC posterior distributions is an efficient procedure to detect unreliable constraints-mostly due to low indices S/Nwhile maximizing the number of galaxies for which we obtain reliable constraints.

Appendix C. Different sets of indices
In this section we illustrate the results obtained with different index sets. We explore the huge number of viable combinations (∼ 1 million) following three approaches: • Maximize the information to be fitted. Using a higher number of indices should provide more stable results. However, indices should be calibrated, and able to disentangle degeneracies giving equal weight to each model parameter.
• Use already-proposed index combinations. In particular we include redder Mg and Fe indices as done by previous works.
• Use a small, essential set to break the existing degeneracy between parameters. Spectral indices are sensitive to variations in age, [Z/H], [α/Fe], but the relative sensitivity to these parameters is not identical. One can choose a small combination of four to six indices based upon their different sensitivity (e.g., for element abundances see Tripicco & Bell 1995;Korn et al. 2005;Lee et al. 2009).
As discussed in the text, we find an optimal combination that maximizes the spectral coverage, the number of constrained galaxies, and the precision of the constraints: Hδ A , CN 1 , CN 2 , Ca4227, G4300, Hγ A , Hγ F , Fe4383, Fe4531, and C 2 4668 (hereafter baseline). In Table 6, we report the other main sets of indices that we analyzed, and the differences in the derived parameters with respect to the baseline combination. We also analyzed many other sets (∼ 50), but they do not add significant information to this study. Notes. For each combination we report the number of constrained galaxies, along with simple and 1σ differences in the derived parameters with respect to the baseline combination.
We find no significant systematic differences in the derived parameters when small changes to the baseline combination are applied, i.e. by adding or removing one to two indices. In particular, we focus here on the removal of CN indices, as the nitrogen abundance is not a free parameter in TMJ11 models (Combo 1), and on the removal of those indices that sample twice the same spectral region (Combo 2). In the first case, we obtain constraints for fewer galaxies with respect to the baseline combination, but with an overall excellent agreement. In the second case, we constrain about the same number of galaxies, obtaining lower ages but still in agreement with the baseline set. This is likely due to the removal of Hγ F that reduces the weight of age-sensitive features, therefore producing less reliable age estimates.
Index combinations discussed above lack redder indices, such as the Mg indices, traditionally used as αabundance indicators. Therefore, we repeat the analysis including Mg b (Combo 3). Amongst the 59 with such relatively large spectral coverage, we obtain constraints for 39 galaxies. Differences are shown in greater detail in Figure  be underestimated by ∼ 0.1 dex. However, given the small statistical significance, we do not correct for this offset. A similar discussion can be done by extending the analysis at all the redder indices (Combo 4, which is also the same combination used in Onodera et al. 2015). Finally, we note here that although minimal sets of N = 4 indices in the wavelength region between 4000 and 4600Å allow us to analyze a large number of galaxies (∼ 300), we do not find a relevant set to place constraints on more than one-third of them. This situation is improved when N = 5 − 7 (e.g., Combo 5), but results show an overall stronger age-metallicity degener-acy with respect to N ≥ 8. It is also interesting to note that even if we do not include Balmer indices (Combo 6), we still obtain constraints for more than 100 galaxies. But we stress again that results are less reliable because we removed most of the age-sensitive features.
In summary, we find that a blind choice of index combinations can lead to less-robust results. This happens when a combination is unbalanced toward one or more parameters of the fit, but also if indices are measured on spectra where the sky subtraction was imperfect. After an extensive analysis, we demonstrate that within the limited statistics and wavelength coverage of current data, results do not show significant systematic differences.