
SINGLE PARAMETER GALAXY CLASSIFICATION: THE PRINCIPAL CURVE THROUGH THE MULTI-DIMENSIONAL SPACE OF GALAXY PROPERTIES


Published 2012 August 3 © 2012. The American Astronomical Society. All rights reserved.
Citation M. Taghizadeh-Popp et al 2012 ApJ 755 143 DOI 10.1088/0004-637X/755/2/143


ABSTRACT

We propose to describe the variety of galaxies from the Sloan Digital Sky Survey by using only one affine parameter. To this aim, we construct the principal curve (P-curve) passing through the spine of the data point cloud, considering the eigenspace derived from Principal Component Analysis (PCA) of morphological, physical, and photometric galaxy properties. Thus, galaxies can be labeled, ranked, and classified by a single arc-length value of the curve, measured at the unique closest projection of the data points on the P-curve. We find that the P-curve has a "W" letter shape with three turning points, defining four branches that represent distinct galaxy populations. This behavior is controlled mainly by two properties, namely u − r and star formation rate (from blue young at low arc length to red old at high arc length), while most other properties correlate well with these two. We further present the variations of several important galaxy properties as a function of arc length. Luminosity functions vary from steep Schechter fits at low arc length to double power law and ending in lognormal fits at high arc length. Galaxy clustering shows increasing autocorrelation power at large scales as arc length increases. Cross correlation of galaxies with different arc lengths shows that the probability of two galaxies belonging to the same halo decreases as their distance in arc length increases. PCA analysis allows us to find peculiar galaxy populations located apart from the main cloud of data points, such as small red galaxies dominated by a disk, of relatively high stellar mass-to-light ratio and surface mass density. On the other hand, the P-curve helped us understand the average trends, encoding 75% of the available information in the data. 
The P-curve allows not only dimensionality reduction but also provides supporting evidence for the following relevant physical models and scenarios in extragalactic astronomy: (1) The hierarchical merging scenario in the formation of a selected group of red massive galaxies. These galaxies present a lognormal r-band luminosity function, which might arise from multiplicative processes involved in this scenario. (2) A connection between the onset of active galactic nucleus activity and star formation quenching as mentioned in Martin et al., which appears in green galaxies transitioning from blue to red populations.


1. INTRODUCTION

In order to constrain the physical processes driving galaxy evolution, it is common practice to measure a number of physical properties for a set of galaxies and then investigate the correlations between these parameters. In this context, galaxy surveys have become more and more appropriate. The number of galaxies available is increasing, and the amount of information to constrain physical properties is also increasing, yielding more accurate estimates. The level of precision of these estimates is also likely to increase in the future, either with the combination of wide angle surveys observing at different wavelengths or with panchromatic surveys using a large number of filters (e.g., PAU, Benítez et al. 2009), which will benefit from multiband imaging for millions of galaxies. As this data deluge is turning astronomy into a data intensive or e-science (see Hey et al. 2009), one is confronted with the issue of being able to analyze the feature space, the dimensionality of which keeps increasing. In the face of such a large number of physical properties, one wants to find the minimal and most important set which describes galaxies accurately. In this context, a common approach used to reduce the dimensionality of these data sets is performing a Principal Component Analysis (PCA; also known as Karhunen-Loève transform; see, e.g., Efstathiou & Fall 1984; Murtagh & Heck 1987). PCA enables us to find an uncorrelated and orthonormal set of linear combinations of properties (eigenvectors) that optimally describe the correlations and variation of the data. This approach has been fruitfully used in astronomy to classify galaxies and quasars based on their spectra (Connolly et al. 1995; Yip et al. 2004a, 2004b). PCA has also been applied on a wider basis using various galaxy properties, such as the equivalent width of emission lines (Győry et al. 2011) or a mix of spectral and morphological features (Coppa et al. 2011), to help characterize the galaxy population.
PCA also proved useful, for instance, when applied to stellar population synthesis models to derive galaxy physical parameters (Chen et al. 2012). PCA, however, does not enable us to capture all the information contained in the input sample. It is by nature linear, and hence cannot describe nonlinear correlations within the data. Other methods, such as applying locally linear embedding to galaxy spectra (Roweis & Saul 2000; Vanderplas & Connolly 2009), enable us to take into account nonlinearities, as they map high-dimensional data onto a surface while preserving the local geometry of the data.

In this paper, we introduce the principal curve (P-curve, see, e.g., Einbeck et al. 2007, for a review), which can be seen as a nonparametric extension of linear PCA. The principal curve is the curve following the location of the local mean in the multi-dimensional cloud of data points. In practice, the P-curve can be conveniently built in the PCA eigenspace spanned by the most important eigenvectors along which the variance is highest. The important fact here is that every data point can be assigned a unique closest projection onto the curve and can be labeled by the arc-length value measured from the beginning of the curve to the projection. This reduces the complexity of multi-dimensional data effectively into only one dimension. Moreover, the ranking of galaxies according to their associated arc-length values provides a natural and objective way of ordering, partitioning, and classifying the rich zoo of galaxies in the nearby universe.

In this paper, we take advantage of the wealth of data and build the principal curve for both physical and photometric properties belonging to the low-redshift Main Galaxy Sample (MGS; Strauss et al. 2002) in Sloan Digital Sky Survey (SDSS; Stoughton et al. 2002). Since the MGS is flux limited, the Malmquist bias underestimates the volume density of faint galaxies compared to that of brighter ones. As a result, the common practice of performing a simple PCA for all galaxies does indeed provide a biased result toward the behavior of the properties of bright objects. As a solution, we do not restrain the statistics by constructing a much smaller volume-limited sample, but instead keep all galaxies by assigning them weights with which we perform weighted PCA (WPCA) and P-curve methods. We then investigate how the arc length associated with each galaxy correlates with a number of photometric, spectroscopic, and physical galaxy properties, as well as morphology, mean spectra, and a first (luminosity function, LF) and second (clustering) moments of galaxies. Our results show that the arc-length values remarkably encode a large number of well-known trends in the local universe.

This paper is organized as follows: In Section 2 we present the data set we use. Section 3 details the galaxy properties we include in our PCA analysis. Section 4 presents the methods we use for the dimensionality reduction, WPCA, and principal curve. We detail in Section 5 how we build the principal curve from the SDSS data. In Section 6 we present our results and discuss them in Section 7.

We use in this paper a flat Λ cosmology assuming $\lbrace \Omega_{\Lambda}, \Omega_{M}, h_{0}, w_{0} \rbrace = \lbrace 0.7, 0.3, 0.7, -1 \rbrace$.

2. THE GALAXY SAMPLE

In this paper we use photometric and spectroscopic data of galaxies from SDSS-DR8 (Aihara et al. 2011), available in an MS-SQL Server database queried online via CasJobs.

In particular, we use the MGS (Strauss et al. 2002). These galaxies constitute a flux-limited sample with an r-band Petrosian apparent magnitude cut of mr ⩽ 17.77 and a redshift distribution peaking at z ∼ 0.1. Their spectra cover the rest-frame range of 3800–8000 Å, with a resolution of 69 km s$^{-1}$ pixel$^{-1}$.

Several selection cuts and flags were enforced in order to have a clean sample. We selected only science primary objects appearing in calibrated images having the photometric status flag. Also, we selected imaging fields where $0.6 \le {\tt score} \le 1.0$, which assures good imaging quality with respect to the sky flux and the width of the point-spread function (PSF). Furthermore, we rejected individual objects with bad deblending (with flags PEAKCENTER, DEBLEND_NOPEAK, NOTCHECKED), interpolation problems (PSF_FLUX_INTERP, BAD_COUNTS_ERROR), or suspicious detections (SATURATED, NOPROFILE). Also, we chose galaxies whose spectral line measurements and properties are labeled as RELIABLE.

The sky footprint of the clean spectroscopic survey is built up from a complicated geometry defined by sectors, whose aggregated area covers ∼7930 deg$^{2}$, or a fractional area FA ≃ 0.192 of the whole sky. We choose a redshift window of [z1, z2] = [0.02, 0.08]. The lower limit avoids including galaxies that are large on the sky and photometrically cumbersome, and the upper limit reduces the amount of evolution of galaxy properties (Δt < 0.78 Gyr) while keeping the statistics high. Redshift incompleteness arises from the fact that two 3'' aperture spectroscopic fibers cannot be placed closer than 55'' to each other on the same plate. As a strategy, denser regions in the sky are given a greater number of overlapping plates. Nevertheless, 7% of the initial galaxies photometrically targeted as MGS did not have their spectra taken.
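The fractional area and the time span of the redshift window quoted above can be checked with a short calculation. The sketch below assumes the flat cosmology adopted in this paper (ΩM = 0.3, ΩΛ = 0.7, h0 = 0.7) and takes the conversion 1/H0 ≈ 977.8/H0 Gyr for H0 in km s$^{-1}$ Mpc$^{-1}$; the function names are illustrative and do not belong to any SDSS tool.

```python
import math

FULL_SKY_DEG2 = 4.0 * math.pi * (180.0 / math.pi) ** 2  # about 41253 deg^2
H0 = 70.0                       # km/s/Mpc, i.e. h0 = 0.7
OMEGA_M, OMEGA_L = 0.3, 0.7     # flat cosmology adopted in the paper
HUBBLE_TIME_GYR = 977.8 / H0    # 1/H0 expressed in Gyr


def fractional_area(area_deg2):
    """Fraction of the whole sky covered by `area_deg2` square degrees."""
    return area_deg2 / FULL_SKY_DEG2


def lookback_time_gyr(z, steps=1000):
    """Lookback time to redshift z, integrated with Simpson's rule (even `steps`)."""
    def integrand(x):
        e = math.sqrt(OMEGA_M * (1.0 + x) ** 3 + OMEGA_L)
        return 1.0 / ((1.0 + x) * e)

    h = z / steps
    total = integrand(0.0) + integrand(z)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return HUBBLE_TIME_GYR * total * h / 3.0


f_a = fractional_area(7930.0)
delta_t = lookback_time_gyr(0.08) - lookback_time_gyr(0.02)
print(f"F_A = {f_a:.3f}, delta_t = {delta_t:.2f} Gyr")
```

Under these assumptions the footprint comes out at about 19% of the sky and the window [0.02, 0.08] spans roughly 0.78 Gyr of lookback time, in line with the values in the text.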

We further construct a magnitude-limited sample, on which we will center our main study. Here, extinction-corrected Petrosian apparent magnitude cuts of $[m_{r,1}, m_{r,2}] = [13.5, 17.65]$ are applied. The lower limit is set because of the cross talk that arises between neighboring fibers in the spectrographs when they contain light from very bright galaxies. The upper limit safely avoids the slight variations of the limiting apparent magnitude around 17.77 over the sky in the targeting algorithm. This leaves us with 174,266 galaxies.

A volume-limited subsample was also created, being a subset of the previous magnitude-limited sample. This subsample is used for the study of spatial correlation functions in Section 6.4. The redshift range is [z1, z2] = [0.02, 0.05], with an absolute magnitude window of $[M_{r,1}, M_{r,2}] = [-21.19, -19.08]$, which leaves us with ∼40,000 galaxies.

3. SELECTING GALAXY PROPERTIES

Galaxies present a variety of physical, spectroscopic, and five-band photometric properties made available in the SDSS-DR8 data catalog. We selected the most relevant in order to create a p-dimensional cloud of properties or features for further study.

Among the included photometry-derived properties are the colors, which show the coarse shape of galaxy spectra and, to some extent, the age of the overall stellar population in the galaxy. Only the colors u − r and g − r were selected, since most of the color combinations possible from the u, g, r, i, z bands (Fukugita et al. 1996) are highly correlated. For computing colors, extinction-corrected model magnitudes (Stoughton et al. 2002) are used, as well as k-corrections to an observing rest frame of z = 0. The k-corrections are calculated using a template fitting technique used in, e.g., Budavári et al. (2000) and Csabai et al. (2000). Here, the colors are matched to the colors of a model spectrum defined by a non-negative linear combination of redshifted template spectra. Then, the best model spectrum is blueshifted back to the rest frame (z = 0) and the k-correction is computed. The template spectra are drawn from a list provided by Bruzual & Charlot (2003).

Since we study the LF as a function of position in this cloud (Section 6.3), we decided not to include the absolute magnitude Mr as a property. If we did, any partitioning of the cloud would introduce undesired artificial cuts in the range of absolute magnitudes used in the computation of LFs. Therefore, neither the absolute magnitude nor any other property strongly correlated with it (such as stellar mass) should be chosen as part of the properties.

Another photometry-derived feature is the concentration index C ≡ R90r/R50r, where R90r and R50r are the radii enclosing 90% and 50% of the r-band Petrosian flux, respectively. This index has been found to correlate with galaxy morphological type (Strateva et al. 2001; Shimasaku et al. 2001). Indeed, de Vaucouleurs light profiles of elliptical galaxies are more concentrated than the exponential profile in the disks of spiral galaxies.

The redshift-dependent r-band surface brightness defined by $\mu_{50,r} = m_{r} + 2.5\log [2\pi R50_{r}^{2}(1+z)^{4}]$ is also included as a property. This breaks the degeneracy of R90r/R50r between bright and dim spiral galaxies. Here we use the extinction and k-corrected Petrosian apparent magnitude mr, taking $\sqrt{2}R50_{r}$ as a less noisy proxy for the Petrosian radius (Stoughton et al. 2002; Strauss et al. 2002).
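As an illustration, the two photometric shape properties can be written as small functions. This is only a sketch that transcribes the surface brightness expression exactly as printed above (including the sign of the redshift exponent); it assumes radii in arcseconds, so that the result is in mag arcsec$^{-2}$, and the function names are hypothetical.

```python
import math


def concentration_index(r90_arcsec, r50_arcsec):
    """Concentration index R90_r / R50_r from the Petrosian radii."""
    return r90_arcsec / r50_arcsec


def surface_brightness_mu50(m_r, r50_arcsec, z):
    """mu_50,r = m_r + 2.5 log10[2 pi R50_r^2 (1 + z)^4], as written in the text.

    The area 2 pi R50_r^2 is that of a circle of radius sqrt(2) * R50_r,
    the proxy for the Petrosian radius mentioned in the text.
    """
    area_term = 2.0 * math.pi * r50_arcsec ** 2 * (1.0 + z) ** 4
    return m_r + 2.5 * math.log10(area_term)


# Example with made-up values: m_r = 16.5, R50 = 3'', R90 = 7.5'', z = 0.05
mu = surface_brightness_mu50(16.5, 3.0, 0.05)
conc = concentration_index(7.5, 3.0)
```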

The physical properties selected are the star formation rate (SFR), specific star formation rate (SFR/M*, where M* is the stellar mass), and Petrosian r-band mass-to-light ratio (M*/Lr). These are included in SDSS-DR8 and obtained from galaxy spectra analysis at MPA and JHU, as detailed in Kauffmann et al. (2003b), Brinchmann et al. (2004), Tremonti et al. (2004), Gallazzi et al. (2005), and Salim et al. (2007).

Note that M* has been derived from template fitting to the total flux in the five photometric bands (Aihara et al. 2011). As the spectral fibers' diameters cover only 3'' of the central part of each galaxy, the SFR had to be corrected for this deficiency to its full value (Brinchmann et al. 2004).

Since other spectral features such as Lick indices or line equivalent widths are non-trivial to extrapolate from their fiber values to the full galaxy ones, we do not include these in the building of the cloud of properties. We do, however, study them separately in Sections 6.2 and 7.

4. METHODS FOR DIMENSIONALITY REDUCTION

Most of the time, data mining deals with the data point matrix $\mathbf{A} = [\mathbf{A}^{1} \mathbf{A}^{2} \ldots \mathbf{A}^{p}] \in \mathbb {R}^{N\times p}$, composed of the columns $\lbrace \mathbf {A}^{i} \rbrace _{i=1}^{p}$ that contain N observations for each of the p properties or features. Thus, A can be thought of as a length-N realization of the random vector $\mathbf {X}=[X_{1} \ldots X_{p}] \in \mathbb {R}^{p}$ with distribution $D_{\mathbf{X}}(\mathbf{x})$.

In our work, dimensionality reduction is used for explaining the variations of X as a function of only one parameter. To that end, we use WPCA and principal curves, detailed descriptions of which are included in Sections 4.1 and 4.2.

4.1. Weighted PCA (WPCA)

PCA (Pearson 1901; Jackson 1991; Jolliffe 2002), also known as Karhunen-Loève transform, is a widely used method for dimensionality reduction and classification. It can be seen as a transformation involving a translation, linear scaling, and rigid rotation of a collection of N p-dimensional data points onto a new coordinate system. The new orthonormal axes, or principal components $\lbrace {\mathbf{PC}}_{i} \rbrace _{i=1}^{p} \in \mathbb {R}^{N \times 1}$, are constructed such that the projections of the data points on the PCi are uncorrelated. PC1 is selected as the axis on $\mathbb {R}^{p}$ which has the highest possible variance of the points projected onto it. The next PCi are ordered in descending value of the variance, PCp being the lowest. Thus, dimensionality reduction is attained by describing the data in terms of the most important principal components (Hastie et al. 2009). This can be obtained by considering only the space spanned by the first q ≤ p variance-ranked eigenvectors whose cumulative variance reaches above a high enough threshold.

In practice, the PCs and their variances can be found using singular value decomposition (SVD) of the covariance matrix C of the data points (Golub & Van Loan 1996). SVD allows us to factorize it in the form $\mathbf{C} = \mathbf{V}\mathbf{\Sigma}\mathbf{V}^{T}$. Here, Σ is a diagonal matrix with the eigenvalues (variances), and V contains the eigenvectors (principal components) in the respective columns. Thus, V contains the expansion coefficients of the transformation ${\mathbf{PC}}_{i} = \sum _{j=1}^{p} \mathbf {V}_{ji} \mathbf {x}_{j}$ (i = 1, ..., p) from property space to PC space.

In WPCA, the covariance matrix is calculated in a weighted scheme. Often we are confronted with noisy or missing data points. As a solution, we can assign a weight wi > 0 to each ith data point to account for them. In this context, WPCA involves considering these weights in the calculation of all averages and covariances between the p properties. In general, the properties might have different units; therefore, they first have to be made unitless by standardizing the data points (subtracting from each property its (weighted) average and then dividing by its (weighted) standard deviation).
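The steps described in this section (weighted standardization, weighted covariance, and SVD of the covariance matrix) can be sketched with NumPy as follows. This is a minimal illustration under the assumption that every data point carries one positive weight; the function name `weighted_pca` and the random test data are hypothetical and are not the code used in this work.

```python
import numpy as np


def weighted_pca(data, weights):
    """Weighted PCA of an (N, p) array with one positive weight per row.

    Returns (variances, V, scores); the columns of V are the principal
    components, ordered by decreasing variance.
    """
    data = np.asarray(data, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                          # normalize the weights to sum to 1

    mean = w @ data                          # weighted mean of each property
    centered = data - mean
    std = np.sqrt(w @ centered**2)           # weighted standard deviation
    standardized = centered / std            # unitless properties

    # Weighted covariance of the standardized data (a correlation matrix).
    cov = (standardized * w[:, None]).T @ standardized

    # For a symmetric positive semi-definite matrix, SVD gives C = V S V^T.
    V, variances, _ = np.linalg.svd(cov)
    scores = standardized @ V                # projections onto the PCs
    return variances, V, scores


# Synthetic example: three correlated properties with random weights.
rng = np.random.default_rng(0)
x = rng.normal(size=(500, 3))
x[:, 1] += 2.0 * x[:, 0]
w = rng.uniform(1.0, 5.0, size=500)
variances, V, scores = weighted_pca(x, w)
```

Because the data are standardized, the variances returned by this sketch add up to the number of properties p, and the weighted covariance of the scores is diagonal.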

4.2. Principal Curves

Principal curves (P-curves) and surfaces (P-surfaces) (Hastie 1984; Hastie & Stuetzle 1989; Tibshirani 1992; Gorban et al. 2008) go one step further than PCA, providing a low-dimensional curved manifold that passes through the middle of the data points. In this paper we consider a one-parameter (called l) principal curve $\mathbf {f}(l)=[f_{1}(l), \ldots,f_{p}(l)] \in \mathbb {R}^{N \times 1}$, where each of the N data points $\mathbf{x} = [x_{1} \ldots x_{p}]$ is given a unique closest projection $\mathbf{f}(l_{\mathbf{f}}(\mathbf{x}))$ onto the curve. As a convention, $l_{\mathbf{f}}(\mathbf{x})$ is chosen to be the arc length from the beginning of the curve to the projection point of x. In this context, the P-curve can be considered by itself the first and only curved principal component, as the dimensionality of the data is reduced from p to one dimension. In practice, the P-curve is composed of N − 1 line segments that connect the projection points.

The principal curve is defined as the average of the data points that project onto it, minimizing the projection distance between x and $\mathbf{f}(l_{\mathbf{f}}(\mathbf{x}))$ over all points. This property of self-consistency allows us to follow a series of iterative projection-expectation steps for its construction (Hastie & Stuetzle 1989). In fact, an educated first guess for the P-curve is to make it equal to PC1. Later on, the jth estimate $f^{(j)}_{i}(l)$ of the curve at the jth expectation step is calculated as ${f}^{(j)}_{i}(l) = E(\mathbf {X}_{i}\vert {l}_{\mathbf {f}^{(j-1)}}(\mathbf {X})= l)$. In practice, we compute this expression using a weighted penalized cubic B-spline regression (Silverman 1985; Hastie & Tibshirani 1990; Ruppert et al. 2003; Hastie et al. 2009). These splines are calculated on a series of k knots chosen from the data points, while the degrees of freedom (df) of the regression control the degree of smoothing of the P-curve. The jth projection step is performed next, involving the search for the closest perpendicular projection of x onto $\mathbf{f}^{(j)}(l)$, which is composed of the N − 1 line segments. The iterations stop when the cumulative projection distance from the data points to the P-curve does not change significantly with respect to that of the previous step.
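The projection step can be illustrated for a curve stored as a sequence of vertices joined by line segments. The sketch below finds the closest point on such a polyline and the arc length at that point; it assumes no zero-length segments, it does not include the spline-based expectation step, and the name `project_onto_polyline` is hypothetical.

```python
import numpy as np


def project_onto_polyline(point, vertices):
    """Closest projection of `point` onto the polyline through `vertices`.

    Returns (arc_length, projection, distance), with the arc length
    measured from the first vertex of the curve.
    """
    point = np.asarray(point, dtype=float)
    vertices = np.asarray(vertices, dtype=float)
    starts, ends = vertices[:-1], vertices[1:]
    seg = ends - starts
    seg_len = np.linalg.norm(seg, axis=1)    # assumed to be nonzero
    cum_len = np.concatenate(([0.0], np.cumsum(seg_len)))

    # Fractional position of the perpendicular foot on each segment,
    # clipped to [0, 1] so that the projection stays on the segment.
    t = np.einsum("ij,ij->i", point - starts, seg) / seg_len**2
    t = np.clip(t, 0.0, 1.0)
    feet = starts + t[:, None] * seg
    dist = np.linalg.norm(feet - point, axis=1)

    k = int(np.argmin(dist))                 # index of the closest segment
    return cum_len[k] + t[k] * seg_len[k], feet[k], dist[k]


# Toy L-shaped curve in two dimensions.
curve = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
arc, foot, d = project_onto_polyline((1.5, 0.5), curve)
```

In this toy case the point projects onto the middle of the second segment, so its arc length is the length of the first segment plus half of the second.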

Although P-curves are constructed on the p-dimensional space of properties, we can consider building the P-curve of the data points projected on the first q most important principal components of the WPCA. This minimizes the complexity and computations, especially in the case of p ≫ q, without losing much information. The approximation is of course valid as long as the first q eigenmodes contain as much of the total variance as possible.

5. BUILDING THE PRINCIPAL CURVE AND POPULATION SEPARATORS ALONG ARC LENGTH

5.1. Vmax Weighting

As the MGS is a magnitude-limited sample, not all galaxy types are sampled equally in the survey volume. As a consequence, we used WPCA and a weighted principal curve of the galaxy population to get an unbiased result.

In detail, at higher redshifts we sample mostly the brightest galaxies, neglecting the faint ones (Malmquist bias). At the other extreme, at low redshifts the SDSS spectrograph fails to take the spectra of very bright galaxies (see Section 2).

As a solution, we use the Vmax weighting method (Schmidt 1968) to account for this incompleteness. Here, each ith galaxy is assigned a weight $w_{i} = V_{S}/V_{\mathrm{max},i} \ge 1$, where $V_{S}$ is the volume of the survey. Here we note that, given the particular [z1, z2] and [m1, m2] intervals for the survey, the ith galaxy found at zi could be observed only within a maximum comoving volume $V_{\mathrm{max},i} \le V_{S}$. If the ith galaxy of apparent magnitude mi, k-correction ki = k(zi), and at a luminosity distance DL(zi) were to have limiting apparent magnitudes $m_{1,2}$, then it should be moved to a limiting luminosity distance $D_{L,i}(m_{1,2})$ given by

Equation (1)

Hence, the maximum volume is defined by the biggest interval of DL inside which a galaxy can appear in the survey:

Equation (2)

As Equation (1) defines zlim in an implicit way, we solve for it iteratively. We calculated the Vmax values directly inside the database using an integrated cosmological functions library (Taghizadeh-Popp 2010).
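A possible way to compute such weights outside the database is sketched below. It is not the library used in this work: it assumes the standard distance-modulus relation m_lim = m_i + 5 log[D_L(z)/D_L(z_i)] + k(z) − k(z_i), solved for the limiting redshift by bisection, the flat cosmology adopted in this paper, and, by default, a zero k-correction. All function names are hypothetical, and galaxies lying outside the sample limits are not handled.

```python
import math

C_KM_S = 299792.458
H0, OMEGA_M, OMEGA_L = 70.0, 0.3, 0.7    # flat cosmology adopted in the paper


def comoving_distance(z, steps=200):
    """Comoving distance in Mpc for a flat universe (Simpson's rule)."""
    def inv_e(x):
        return 1.0 / math.sqrt(OMEGA_M * (1.0 + x) ** 3 + OMEGA_L)

    h = z / steps
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * inv_e(i * h)
    return (C_KM_S / H0) * total * h / 3.0


def luminosity_distance(z):
    return (1.0 + z) * comoving_distance(z)


def redshift_at_magnitude(m_i, z_i, m_lim, kcorr=lambda z: 0.0):
    """Redshift at which the galaxy would be observed with magnitude m_lim.

    Assumes the apparent magnitude grows monotonically with redshift.
    """
    d_l_i = luminosity_distance(z_i)

    def excess(z):
        dm = 5.0 * math.log10(luminosity_distance(z) / d_l_i)
        return m_i + dm + kcorr(z) - kcorr(z_i) - m_lim

    lo, hi = 1e-6, 5.0
    for _ in range(60):                      # bisection on the implicit equation
        mid = 0.5 * (lo + hi)
        if excess(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def vmax_weight(m_i, z_i, z_window=(0.02, 0.08), m_window=(13.5, 17.65)):
    """Weight w_i = V_S / V_max,i; the sky-area factor cancels in the ratio."""
    z1, z2 = z_window
    m1, m2 = m_window
    z_lo = max(z1, redshift_at_magnitude(m_i, z_i, m1))
    z_hi = min(z2, redshift_at_magnitude(m_i, z_i, m2))
    v_s = comoving_distance(z2) ** 3 - comoving_distance(z1) ** 3
    v_max = comoving_distance(z_hi) ** 3 - comoving_distance(z_lo) ** 3
    return v_s / v_max


w_everywhere = vmax_weight(15.0, 0.03)   # visible across the whole window
w_faint = vmax_weight(17.0, 0.05)        # drops out before reaching z2
```

In this sketch a galaxy that stays within the magnitude limits across the whole redshift window receives a weight of one, while a faint galaxy that would fall below the faint limit before reaching z2 receives a weight larger than one.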

The PCA, P-curve, and calculations related to volume densities in this paper (such as histograms) are all Vmax weighted.

5.2. WPCA Results

As a measure to avoid skewing the PCA, we visually clipped the outliers in each of the p = 7 galaxy properties in order to dismiss artifacts or incorrect measurements. We also used only galaxies which have all seven properties measured, simplifying the calculations and avoiding the use of Gappy PCA (Connolly & Szalay 1999). This left us with a total of N = 171,698 galaxies (99% of the initial ones).

Figure 1 and Table 1 present the results from computing WPCA on the seven galaxy properties. From Table 1, we can note that most of the information (97% of the total variance) is contained in the first four principal components. In Figure 1, each PCi i = 1, ..., 7 can be viewed as a linear combination of properties, with the expansion coefficients Vji of the jth property stored in the jth row. Coefficients with stronger color show a higher importance of the property for the given PC. The sign of the coefficient shows correlations/anticorrelations between the properties and the final value of the PC.


Figure 1. V matrix resulting from applying WPCA to the seven galaxy properties. The columns are the orthonormal principal components (i.e., eigenvectors of the covariance matrix). Each PCi i = 1, ..., 7 can be viewed as a linear combination of properties, with the expansion coefficients Vji of the jth property stored in the jth row. Coefficients with stronger color show a higher importance of the property for the given PC. The sign of the coefficient shows correlations/anticorrelations between the properties and the PC.


Table 1. WPCA Variances

Quantity                                  PC1    PC2    PC3    PC4    PC5    PC6    PC7
$\sigma _{\mathrm{PC}}^2$                 4.493  1.115  0.842  0.363  0.090  0.068  0.030
$\sum \frac{\sigma _{\mathrm{PC}}^2}{p}$  0.642  0.801  0.921  0.973  0.986  0.996  1.000

Notes. Variance of each principal component and its associated cumulative variance. Since the data have been standardized, the sum of the variances is equal to the number of dimensions (p = 7).
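The cumulative fractions in Table 1 follow directly from the listed variances, as the short check below shows. The 0.97 threshold used to pick the number of components is only an illustrative choice, and the helper name is hypothetical.

```python
from itertools import accumulate

variances = [4.493, 1.115, 0.842, 0.363, 0.090, 0.068, 0.030]  # Table 1
p = len(variances)

# Cumulative variance fraction, normalized by the number of dimensions p.
cumulative = [total / p for total in accumulate(variances)]


def components_needed(cumulative_fraction, threshold):
    """Smallest q whose cumulative variance fraction reaches `threshold`."""
    for q, fraction in enumerate(cumulative_fraction, start=1):
        if fraction >= threshold:
            return q
    return len(cumulative_fraction)


q = components_needed(cumulative, 0.97)
print([round(c, 3) for c in cumulative], q)
```

With this threshold the first four components are retained, which matches the four-dimensional space used later to build the principal curve.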


For PC1, the strength (absolute value) of its expansion coefficients Vj1 in the basis of the galaxy properties is shared mostly evenly between these properties, with u − r, g − r, SFR/M*, and M*/Lr being the most important. The correlations show that high values of u − r, g − r, M*/Lr, and R90r/R50r, together with low values of μ50,r, SFR/M*, and SFR, will produce a high PC1 value. We might therefore expect that PC1 is a good separator between the young, blue population of spirals/irregulars and the old population of red ellipticals.

For PC2, the SFR and μ50, r are the most important, having opposite signs. Thus, we expect galaxies with bright surface brightness and high star formation to show high values of PC2.

For PC3, the most important property is R90r/R50r, with an opposite correlation with respect to the next most important properties, which are of mostly equal strength (u − r, g − r, SFR, M*/Lr, and μ50,r). We can expect that big and bright star-forming spiral galaxies with reddish colors (probably from a red core) should have high PC3.

For PC4, all the properties have the same correlations, μ50, r, R90r/R50r, and SFR being the most important. Thus, concentrated (and possibly star-forming) galaxies of faint surface brightness have high values of PC4. As the variance along PC4 is much smaller than along the previous PCs, this is a rare combination of correlation for these properties to be observed at the same time.

Furthermore, the last three PCs (PC5, PC6, and PC7), which together account for less than 3% of the total variance (see Table 1), are less obvious to interpret. They might trace either special cases of galaxy populations or just artifacts and wrong/noisy measurements of the properties.

5.3. The Fitted Principal Curve and Population Separators along It

We decided to construct the principal curve in the four-dimensional space defined by {PC1, ..., PC4}, since their combined cumulative variance (0.973) is close to unity (see Table 1). Although the computation for the number of dimensions and data points involved is not too intensive, we think of this as a pedagogical example that can be used for other extreme cases when N ⩾ $10^{10}$ objects with p ⩾ 100 dimensions, for instance. In fact, our choice does not change the results significantly compared to using p = 7.

In the expectation step for creating the principal curve, each PCi is fitted with penalized B-splines of 5.4 degrees of freedom (df), defined at a sequence of k = 211 unique knots chosen at equally spaced quantiles of arc-length values. Principal curves with df ≳ 7 make the curve oscillate excessively, turning back and forth across and along PC1, whereas with df ≃ 4 it resembles more closely a straight line along the PC1 direction.

Figure 2 shows the result of fitting the principal curve to {PC1, ..., PC4}. The four-dimensional cloud of properties presents two density maxima placed mainly along the PC1 direction, corresponding to the blue and red galaxy populations. The principal curve closely resembles the letter "W," presenting clearly four different regimes or branches separated by three turning points (T-points).


Figure 2. Principal curve (black continuous line) fitted to the first four principal components (density maps are log scaled, with contour curves separated by 0.5 dex). The arc length increases in the direction of increasing PC1. The first and last 15 data points (ordered by arc length) are connected to their corresponding projections on the curve with dashed lines. The separators between the $\lbrace L_{i} \rbrace_{i=1}^{20}$ groups are shown as black circles on top of the curve (see text). The curve presents three turning points marked as brown rings. The colored arrows show the directions and relative strength of the galaxy properties projected onto PC space. In PC3 vs. PC4, we plotted the separating line between the main cloud and a small blob of galaxies, given by PC4 ⩽ −1.3 + 0.55(PC3 − 2.0) (see Section 6.5.1).


We created 20 equal number density galaxy groups (in Mpc$^{-3}$) labeled as $\lbrace L_{i} \rbrace_{i=1}^{20}$ by placing population separators at fixed arc-length values along the P-curve, as shown in Figures 2 and 3. Galaxies are grouped together into the same Li group when the arc-length values measured at their projection points on the P-curve are placed between two consecutive separators. These separators are positioned in such a way that the (Vmax-weighted) number density (in Mpc$^{-3}$) of the galaxies belonging to each of the 20 Li groups amounts to 1/20th of that from the whole sample of galaxies. This allowed us to study the four principal curve branches in detail. We chose the arc length to increase in the same direction as increasing PC1, with growing values of arc length as we progress from L1 to L20. Thus, the P-curve's first branch is contained in {L1, ..., L6}, the second branch in {L7, ..., L14}, the third branch in {L15, ..., L18}, and the fourth branch in {L19, L20}. Table 2 shows some statistics of these groups.
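Placing separators so that each group holds the same weighted number density amounts to computing weighted quantiles of the arc-length values. The following sketch shows one way to do this in plain Python; it assumes each galaxy comes with its arc length and its Vmax weight, and the function names are hypothetical.

```python
from bisect import bisect_left


def equal_density_separators(arc_lengths, weights, n_groups):
    """Arc-length values that split the sample into groups of equal total weight."""
    pairs = sorted(zip(arc_lengths, weights))
    total = sum(w for _, w in pairs)
    separators, running, k = [], 0.0, 1
    for l, w in pairs:
        running += w
        # Place a separator each time the cumulative weight reaches k/n_groups.
        while k < n_groups and running >= k * total / n_groups:
            separators.append(l)
            k += 1
    return separators


def group_index(l, separators):
    """1-based group label (L1, L2, ...) for a galaxy with arc length l.

    A galaxy sitting exactly on a separator belongs to the lower group.
    """
    return bisect_left(separators, l) + 1


# Toy example: three galaxies, two groups of equal total weight.
seps = equal_density_separators([1.0, 2.0, 3.0], [3.0, 1.0, 2.0], 2)
labels = [group_index(l, seps) for l in (1.0, 2.0, 3.0)]
```

In the toy example the first galaxy alone carries half of the total weight, so it forms the first group and the other two form the second.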


Figure 3. Probability density of the arc-length values $\lbrace l_{i} \rbrace_{i=1}^{N}$ measured at the points of the curve onto which the N data points are projected. The arc length increases in the direction of increasing PC1. Vertical black continuous lines denote the population separators, while the numbers denote the $\lbrace L_{i} \rbrace_{i=1}^{20}$ galaxy groups. The five small tick marks within each galaxy group mark the boundaries of the subgroups {λ1, ..., λ100}.


Table 2. Statistics of the $\lbrace L_{i} \rbrace_{i=1}^{20}$ Galaxy Groups$^{a}$

Group Ngal lmin lmax 〈l〉 d
L1 4050 0 4.12 3.54 1.57
L2 2987 4.12 4.67 4.41 1.34
L3 2368 4.67 5.09 4.87 1.25
L4 2136 5.09 5.54 5.32 1.14
L5 1589 5.54 5.88 5.72 1.12
L6 1345 5.88 6.11 6.01 1.17
L7 1674 6.11 6.31 6.21 1.27
L8 2190 6.31 6.55 6.44 1.29
L9 3568 6.55 6.87 6.71 1.12
L10 6287 6.87 7.25 7.06 1.16
L11 10196 7.25 7.67 7.46 1.21
L12 13862 7.67 8.12 7.89 1.31
L13 17062 8.12 8.63 8.37 1.39
L14 18283 8.63 9.24 8.93 1.51
L15 15287 9.24 9.99 9.6 1.56
L16 9421 9.99 10.82 10.39 1.50
L17 5546 10.82 11.45 11.16 1.51
L18 9610 11.45 12.21 11.82 1.34
L19 19877 12.21 13.09 12.64 1.12
L20 24360 13.09 20.24 13.69 1.11
All 171698 0 20.24 7.91 1.31

Notes. $^{a}$Ngal denotes the number of galaxies in each group, contained in the arc-length interval [lmin, lmax] with average arc length 〈l〉. The value d denotes the quadratic mean (root mean square) of the projection distances from the data points onto the P-curve.


Within each Li group, we further created five subgroups of galaxies along the arc length, naming them $\lbrace \lambda_{i} \rbrace_{i=1}^{100}$, also of equal number density in Mpc$^{-3}$ as explained before. We further partitioned these groups similarly, now using several radial separators in the direction perpendicular to the curve, defining 10 concentric cylinder-like separating surfaces. In this way, the groups defined by this finer partitioning all have the same number density (in Mpc$^{-3}$), equal approximately to 1/1000th of the number density of the whole sample. This allowed us to identify and extract localized galaxy populations positioned very close to the spine of the cloud of properties; we study them in Section 6.5.2.

Figure 3 shows the probability density distribution of the arc-length l values as well as the population separators. The curve has a length of lmax = 20.24, and the variance of the arc-length values is $\sigma_{l}^{2} = 7.79$, measured with respect to the center of the curve at 〈l〉 = 7.91. Note that Table 2 shows that the quadratic mean (root mean square) of all the projection distances from the data points to the P-curve takes a value of d = 1.31, which is small compared to the length of the curve. The blue and red peaks of maximum density are clearly visible, as well as a small green peak located between the previous two. The first turning point (at L6) lies close to the blue maximum (L7), whereas the red maximum (L17) is a little behind the third T-point (L18), after which we find a hump defining the red sequence of galaxies. We find a green maximum (L16) standing between the second T-point (L14) and the red maximum.

Figure 4 shows the density maps of the scatter of each {PC1, ..., PC4} as a function of the arc length. The different shapes that this scatter presents depend evidently on the contortions or twists of the principal curve along the PCs. As the four branches of the curve mostly turn left and right along PC2, the scatter in PC2 shows the same "W" shape as the P-curve. On the other hand, the curve increases its length in the PC1 direction, so the scatter shows a mostly linear relation between PC1 and arc length. The same analysis applies to the scatter of the next PCs, which is boomerang shaped for PC3 and mostly constant with respect to arc length for PC4 (although with little wiggles).

Figure 4. Density maps of the principal components (y-axis) as a function of the arc-length l (x-axis). Density is log scaled, with contour curves separated by 0.5 dex. Population separators are shown as vertical tick marks. The numbers denote the {Li} (i = 1, ..., 20) galaxy groups. Colored vertical lines show the position of the maxima and turning points at particular l values.

6. GALAXY PROPERTIES AND STATISTICS AS A FUNCTION OF ARC LENGTH

In this section we show how galaxy properties, LFs, and spatial clustering change as a function of the {Li} (i = 1, ..., 20) equal number density galaxy groups (ordered in ascending arc length).

Compared to PC1 alone, the principal curve provides much more information about particular changes in properties along its arc length. We will see that the evolution of galaxy properties along the curve is intimately related to the "W" shape of the principal curve, where each of the four branches defines particular galaxy populations.

6.1. Morphology and Average Spectra

Figure 5 shows the most representative galaxy morphologies and average spectra for the {Li} (i = 1, ..., 20) groups.

Figure 5. Top: panels with the four most representative galaxy shapes that appear in each of the {Li} (i = 1, ..., 20) galaxy groups. Arc length increases from L1 to L20. White bars show the scale in arcseconds. Bottom: average rest-frame spectrum of the same 20 groups as above. The flux (y-axis) is normalized to be 1 at λ = 4000 Å. The average of each group was performed on 1000 galaxies sampled randomly with probabilities proportional to 1/Vmax. Also marked are the positions of emission lines (top row) and absorption lines (bottom row).

The most evident feature is the change in color and the slope of the spectra (from blue to red), as well as an overall weakening of emission lines (e.g., Balmer series of hydrogen and forbidden lines, such as O iii, O ii, N ii, etc.) and an increase of metallic absorption lines and bands (Na, Mg, H, K, G) as we reach high arc-length values. In the same way, morphological types include various types of blue galaxies at the beginning and middle of the curve, whereas red ellipticals dominate the end of it. This bimodality is expected and agrees with PC1 in Figure 1, also appearing in other studies as the change along the first principal component (e.g., Yip et al. 2004a; Coppa et al. 2011). We can, however, also identify more subtle populations along the arc length, indistinguishable in PC1 alone. These distinct populations are defined on each of the four branches of the principal curve, connected by the three turning points.

With respect to morphologies, we see that the arc length correlates very well with the Hubble galaxy type. We do, however, miss the distinction between barred/non-barred spiral galaxies due to the lack of properties able to separate them. Blue irregulars and blue compact dwarf (BCD) galaxies (Papaderos et al. 2006; Corbin et al. 2006) appear in the first branch of the principal curve. Some of these types of BCDs were identified as the green pea galaxies at higher redshift (Cardamone et al. 2009). These morphologies then change into low surface brightness galaxies (LSBGs) with spiral and irregular shapes, which dominate the first turning point and blue maximum. Bright spirals with strong blue star-forming arms appear in the second branch, which by the second turning point show sizable bulges. A dramatic change happens in the third branch, where reddish big-bulged spirals and lenticulars dominate, forming part of the green and red maxima. A new transition happens at the third turning point, after which the big bright red ellipticals (cDs) and brightest cluster galaxies (BCGs) dominate at the end of the P-curve's fourth branch.

Emission lines, such as the forbidden O ii, O iii, S ii, and Ne ii, as well as the Balmer series of hydrogen (e.g., Hα, Hβ, Hγ), are strong in the violently star-forming blue galaxies at the first branch. These lines weaken as we transition into LSBGs, but interestingly Hα and Hβ become stronger in the second branch, reaching maximal values in the star-forming spirals at the second turning point. After this, they weaken again to become imperceptible in the bright ellipticals in the fourth branch. N ii follows the same pattern as Hα, but remains visible in cD galaxies, as seen in many spectral atlases (e.g., Dobos et al. 2012). On the other hand, O iii declines steadily through the arc length, disappearing after the red maximum.

Absorption lines, such as Na, Mg, and the G band, become evident in the star-forming spirals by the end of the second branch (as the bulge increases in size), and appear strong in the ellipticals at the fourth branch. Although the H and K lines of calcium are always visible, the 4000 Å break increases steadily with arc length, turning into a striking feature in bright ellipticals.

6.2. Spectral and Physical Properties

Figure 6 shows the evolution of the galaxy properties as a function of arc length in the principal curve, whereas Table 3 contains the average values of the properties at each {Li} (i = 1, ..., 20).
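The summary statistics reported per group (a median with the 15.9% and 84.1% quantiles) can be sketched as below. This is an illustrative assumption about the computation: the function name is hypothetical, and the sketch uses unweighted percentiles, whereas the actual analysis may have applied galaxy weights.

```python
import numpy as np

def median_with_1sigma(values):
    """Median of a property plus the upper and lower offsets to the
    84.1% and 15.9% quantiles (the +/-1 sigma range of a Gaussian)."""
    lo, med, hi = np.percentile(values, [15.9, 50.0, 84.1])
    return med, hi - med, med - lo
```

The three returned numbers correspond to an entry written as median, upper error, lower error.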

Figure 6. Galaxy properties (y-axis) as a function of arc length in the principal curve (x-axis). Properties are ordered row-wise and grouped with respect to which principal component they most resemble as in Figure 4 (upper right hand corner of each panel). The PC1 case resembles a straight line, PC2 a "W," and PC3 a boomerang. The black circles at the mean arc-length value within each {Li} (i = 1, ..., 20) group show the position of the median of the distribution of the property in it, together with vertical bars spanning the 15.9%–84.1% quantiles (±1σ). The orange and cyan bars show respectively the same quantiles for the red spine and red spiral blob galaxies discussed in Section 6.5.

Table 3. Medians of Galaxy Property Distributions in Each Galaxy Group, Together with the 15.9%–84.1% Quantiles (±1σ)

Group log M*/Lr log SFR/M* u − r g − r r − i g − z Dn(4000) Lick G4300 Lick Fe4531 Lick Mg2 Lick Na D Lick HδA
  (M☉ L☉, r−1) (yr−1) ... ... ... ... ... (Å) (Å) ... (Å) (Å)
L1 −0.46+0.19− 0.21 −9.26+0.41− 0.24 1.01+0.17− 0.24 0.21+0.09− 0.11 0.13+0.07− 0.06 0.48+0.14− 0.18 1.10+0.06− 0.08 0.25+0.78− 0.70 1.42+0.82− 0.87 0.07+0.02− 0.02 1.72+1.04− 0.78 2.91+1.95− 2.96
L2 −0.38+0.16− 0.15 −9.45+0.25− 0.20 1.16+0.15− 0.17 0.27+0.08− 0.09 0.16+0.05− 0.06 0.56+0.12− 0.16 1.13+0.06− 0.06 0.39+1.04− 1.17 1.62+0.98− 1.07 0.07+0.03− 0.02 1.48+0.65− 0.95 3.72+1.49− 2.37
L3 −0.38+0.12− 0.14 −9.51+0.24− 0.15 1.21+0.12− 0.15 0.28+0.07− 0.07 0.15+0.05− 0.05 0.57+0.11− 0.15 1.16+0.06− 0.07 0.63+1.18− 1.19 1.48+0.94− 1.32 0.06+0.03− 0.02 1.41+0.91− 0.99 4.35+1.45− 1.58
L4 −0.38+0.10− 0.12 −9.50+0.18− 0.18 1.23+0.13− 0.15 0.28+0.06− 0.06 0.16+0.04− 0.04 0.55+0.11− 0.13 1.17+0.07− 0.06 0.72+1.28− 1.41 1.53+1.53− 1.46 0.06+0.03− 0.03 1.21+1.10− 1.00 4.25+1.51− 1.76
L5 −0.38+0.08− 0.09 −9.51+0.15− 0.17 1.27+0.09− 0.14 0.28+0.05− 0.05 0.16+0.04− 0.04 0.54+0.12− 0.16 1.18+0.08− 0.07 0.72+1.89− 1.98 1.60+1.99− 2.23 0.06+0.03− 0.04 1.18+1.42− 1.61 4.25+1.92− 2.02
L6 −0.32+0.06− 0.07 −9.62+0.12− 0.17 1.34+0.09− 0.10 0.32+0.03− 0.04 0.17+0.03− 0.04 0.59+0.09− 0.11 1.21+0.10− 0.08 1.30+2.33− 2.41 1.58+2.13− 2.27 0.07+0.05− 0.05 0.92+1.54− 1.84 4.23+2.09− 2.41
L7 −0.26+0.06− 0.05 −9.76+0.13− 0.26 1.43+0.14− 0.09 0.36+0.03− 0.04 0.19+0.03− 0.04 0.64+0.10− 0.10 1.23+0.11− 0.08 1.33+2.43− 2.88 2.13+2.68− 2.70 0.07+0.04− 0.05 1.21+1.68− 1.83 3.61+2.15− 2.16
L8 −0.19+0.08− 0.05 −9.87+0.16− 0.43 1.51+0.16− 0.12 0.40+0.07− 0.04 0.20+0.03− 0.04 0.72+0.10− 0.08 1.25+0.12− 0.10 1.83+1.60− 2.26 2.33+2.11− 2.44 0.08+0.04− 0.04 1.08+1.17− 1.22 3.81+1.88− 3.04
L9 −0.15+0.07− 0.07 −9.90+0.17− 0.22 1.53+0.18− 0.13 0.41+0.06− 0.05 0.23+0.03− 0.03 0.78+0.08− 0.09 1.25+0.09− 0.07 1.41+1.87− 1.60 2.06+1.70− 1.93 0.08+0.03− 0.03 1.19+1.05− 1.11 3.57+1.82− 1.89
L10 −0.12+0.10− 0.08 −9.91+0.19− 0.25 1.54+0.17− 0.15 0.42+0.06− 0.06 0.24+0.03− 0.03 0.81+0.10− 0.09 1.25+0.09− 0.07 1.61+1.64− 1.50 1.99+1.47− 1.44 0.08+0.03− 0.03 1.31+0.89− 0.94 3.65+1.51− 1.83
L11 −0.09+0.10− 0.09 −9.91+0.18− 0.23 1.56+0.17− 0.15 0.43+0.07− 0.06 0.27+0.03− 0.03 0.85+0.11− 0.09 1.26+0.09− 0.07 1.53+1.46− 1.23 2.03+1.21− 1.32 0.09+0.03− 0.03 1.41+0.77− 0.78 3.55+1.37− 1.46
L12 −0.04+0.08− 0.08 −9.94+0.18− 0.20 1.59+0.16− 0.17 0.45+0.06− 0.06 0.29+0.03− 0.04 0.91+0.10− 0.10 1.26+0.09− 0.07 1.51+1.33− 1.15 2.03+1.13− 1.03 0.09+0.03− 0.02 1.61+0.66− 0.66 3.31+1.26− 1.48
L13 0.06+0.08− 0.08 −10.02+0.21− 0.20 1.72+0.15− 0.17 0.51+0.05− 0.06 0.33+0.04− 0.05 1.02+0.09− 0.11 1.29+0.12− 0.08 1.77+1.38− 1.13 2.13+0.97− 0.95 0.10+0.04− 0.03 1.91+0.65− 0.66 2.92+1.26− 1.54
L14 0.20+0.11− 0.09 −10.19+0.26− 0.22 1.94+0.17− 0.17 0.60+0.05− 0.05 0.37+0.05− 0.05 1.18+0.12− 0.12 1.35+0.13− 0.10 2.35+1.38− 1.24 2.37+0.93− 0.96 0.12+0.04− 0.04 2.29+0.79− 0.70 2.35+1.48− 1.54
L15 0.29+0.15− 0.14 −10.50+0.27− 0.29 2.15+0.21− 0.21 0.66+0.08− 0.08 0.38+0.07− 0.06 1.28+0.19− 0.16 1.44+0.17− 0.13 3.17+1.32− 1.57 2.65+0.99− 1.22 0.13+0.05− 0.04 2.38+0.94− 0.90 1.77+1.95− 1.74
L16 0.23+0.18− 0.15 −10.75+0.32− 0.39 2.15+0.26− 0.25 0.65+0.09− 0.08 0.35+0.08− 0.05 1.20+0.23− 0.18 1.49+0.19− 0.15 3.67+1.56− 1.67 2.87+1.02− 1.38 0.13+0.05− 0.04 1.99+1.04− 0.93 1.50+2.10− 2.14
L17 0.20+0.13− 0.11 −11.11+0.22− 0.35 2.20+0.17− 0.13 0.66+0.05− 0.05 0.33+0.04− 0.03 1.18+0.12− 0.10 1.60+0.15− 0.17 4.47+1.43− 1.67 3.08+1.28− 1.36 0.15+0.04− 0.04 1.83+0.86− 0.80 0.43+1.98− 2.03
L18 0.27+0.09− 0.10 −11.58+0.27− 0.34 2.35+0.15− 0.13 0.69+0.05− 0.04 0.35+0.04− 0.03 1.24+0.09− 0.09 1.70+0.14− 0.14 4.97+1.01− 1.19 3.20+0.84− 0.99 0.17+0.04− 0.04 2.14+0.88− 0.73 −0.55+1.75− 1.40
L19 0.34+0.08− 0.07 −11.93+0.38− 0.31 2.48+0.13− 0.13 0.73+0.04− 0.04 0.38+0.03− 0.03 1.32+0.08− 0.07 1.83+0.11− 0.13 5.38+0.67− 0.82 3.34+0.61− 0.67 0.22+0.04− 0.04 2.94+0.77− 0.76 −1.41+1.35− 1.06
L20 0.38+0.08− 0.07 −12.19+0.41− 0.27 2.57+0.12− 0.12 0.76+0.04− 0.04 0.39+0.03− 0.02 1.37+0.08− 0.06 1.91+0.08− 0.11 5.51+0.45− 0.56 3.38+0.44− 0.48 0.25+0.03− 0.04 3.65+0.70− 0.71 −1.91+1.03− 0.72
Blob 0.61+0.25− 0.18 −11.99+0.40− 0.37 2.62+0.21− 0.16 0.75+0.04− 0.04 0.38+0.03− 0.03 1.34+0.09− 0.06 1.85+0.11− 0.11 5.49+0.57− 1.07 3.43+0.46− 0.47 0.23+0.03− 0.04 3.18+0.86− 0.77 −1.73+1.32− 0.94
Red spine 0.40+0.03− 0.03 −12.36+0.14− 0.14 2.62+0.05− 0.06 0.77+0.02− 0.02 0.39+0.02− 0.02 1.38+0.04− 0.04 1.95+0.06− 0.06 5.59+0.38− 0.38 3.45+0.42− 0.42 0.26+0.02− 0.02 3.87+0.54− 0.55 −2.21+0.64− 0.59
Group log O iii eclass log SFR μ50, r Mr log R90r log M* log μ*, r log Hα log N ii R90r/R50r fracDeVr
  (Å) ... (M☉ yr−1) (mag arcsec−2) (mag) (kpc) (M☉) (M☉ kpc−2) (Å) (Å) ... ...
L1 1.57+0.41− 0.41 0.44+0.17− 0.10 −0.51+0.46− 0.34 20.42+0.71− 0.59 −18.09+0.61− 1.15 0.45+0.23− 0.17 8.63+0.54− 0.36 8.35+0.35− 0.43 1.87+0.29− 0.27 0.89+0.28− 0.32 2.65+0.27− 0.23 0.55+0.37− 0.35
L2 1.35+0.38− 0.35 0.36+0.12− 0.09 −0.75+0.45− 0.30 20.85+0.86− 0.56 −17.87+0.62− 1.08 0.47+0.23− 0.17 8.64+0.45− 0.36 8.25+0.35− 0.46 1.68+0.26− 0.22 0.74+0.24− 0.24 2.53+0.24− 0.21 0.28+0.45− 0.22
L3 1.23+0.28− 0.32 0.32+0.09− 0.08 −0.82+0.41− 0.28 21.26+0.55− 0.48 −17.85+0.65− 0.95 0.51+0.21− 0.18 8.61+0.41− 0.32 8.09+0.29− 0.34 1.57+0.24− 0.21 0.61+0.25− 0.29 2.43+0.21− 0.21 0.14+0.25− 0.14
L4 1.12+0.29− 0.30 0.29+0.09− 0.07 −0.81+0.36− 0.24 21.76+0.51− 0.55 −17.85+0.49− 0.87 0.61+0.16− 0.17 8.61+0.37− 0.22 7.89+0.31− 0.31 1.49+0.22− 0.20 0.54+0.25− 0.24 2.33+0.24− 0.19 0.06+0.19− 0.06
L5 1.06+0.35− 0.31 0.26+0.10− 0.07 −0.89+0.34− 0.19 22.24+0.57− 0.60 −17.73+0.41− 0.79 0.66+0.13− 0.14 8.57+0.33− 0.21 7.70+0.29− 0.32 1.43+0.26− 0.22 0.48+0.25− 0.23 2.20+0.21− 0.18 0.02+0.14− 0.02
L6 0.82+0.40− 0.26 0.20+0.09− 0.08 −0.97+0.31− 0.20 22.52+0.54− 0.47 −17.71+0.41− 0.68 0.70+0.11− 0.15 8.61+0.26− 0.17 7.65+0.20− 0.22 1.27+0.26− 0.20 0.38+0.25− 0.22 2.12+0.17− 0.17 0.00+0.12− 0.00
L7 0.71+0.33− 0.32 0.16+0.07− 0.08 −0.98+0.31− 0.29 22.53+0.47− 0.52 −17.88+0.48− 0.69 0.70+0.13− 0.13 8.76+0.25− 0.19 7.72+0.20− 0.21 1.20+0.21− 0.23 0.37+0.22− 0.22 2.07+0.21− 0.17 0.00+0.14− 0.00
L8 0.57+0.34− 0.37 0.14+0.08− 0.08 −0.97+0.37− 0.56 22.24+0.40− 0.46 −17.95+0.68− 0.84 0.67+0.16− 0.15 8.86+0.31− 0.27 7.89+0.16− 0.14 1.13+0.26− 0.25 0.42+0.22− 0.23 2.18+0.24− 0.22 0.01+0.17− 0.01
L9 0.60+0.32− 0.38 0.14+0.08− 0.08 −0.87+0.44− 0.34 21.91+0.40− 0.53 −18.22+0.66− 0.92 0.67+0.19− 0.16 9.00+0.34− 0.24 8.07+0.16− 0.14 1.21+0.22− 0.26 0.50+0.22− 0.26 2.23+0.25− 0.21 0.00+0.18− 0.00
L10 0.55+0.36− 0.37 0.14+0.09− 0.08 −0.68+0.48− 0.43 21.59+0.47− 0.54 −18.64+0.88− 1.02 0.70+0.20− 0.21 9.20+0.37− 0.32 8.22+0.19− 0.15 1.25+0.24− 0.24 0.62+0.23− 0.23 2.27+0.22− 0.23 0.03+0.21− 0.03
L11 0.50+0.39− 0.41 0.14+0.09− 0.08 −0.43+0.45− 0.50 21.26+0.51− 0.56 −19.19+1.05− 0.97 0.74+0.20− 0.24 9.44+0.37− 0.39 8.38+0.18− 0.15 1.31+0.23− 0.25 0.72+0.23− 0.25 2.28+0.23− 0.25 0.04+0.23− 0.04
L12 0.43+0.43− 0.43 0.13+0.11− 0.10 −0.24+0.44− 0.49 20.86+0.56− 0.63 −19.63+1.21− 0.99 0.75+0.21− 0.28 9.67+0.39− 0.49 8.59+0.21− 0.18 1.37+0.25− 0.28 0.83+0.24− 0.25 2.30+0.28− 0.24 0.07+0.38− 0.07
L13 0.29+0.50− 0.38 0.08+0.12− 0.10 −0.04+0.42− 0.50 20.58+0.56− 0.65 −20.09+1.28− 0.98 0.79+0.20− 0.29 9.95+0.41− 0.54 8.80+0.24− 0.21 1.36+0.27− 0.32 0.87+0.27− 0.29 2.35+0.29− 0.24 0.17+0.46− 0.17
L14 0.20+0.48− 0.32 0.00+0.10− 0.08 0.01+0.45− 0.48 20.46+0.62− 0.64 −20.30+1.28− 1.02 0.83+0.21− 0.27 10.18+0.44− 0.57 9.00+0.28− 0.27 1.27+0.27− 0.33 0.84+0.28− 0.29 2.47+0.31− 0.23 0.34+0.47− 0.33
L15 0.13+0.46− 0.33 −0.06+0.10− 0.08 −0.37+0.43− 0.50 20.62+0.75− 0.67 −19.90+1.39− 1.25 0.80+0.24− 0.26 10.13+0.55− 0.70 9.03+0.31− 0.36 1.05+0.28− 0.37 0.64+0.28− 0.29 2.55+0.32− 0.24 0.40+0.47− 0.36
L16 −0.03+0.44− 0.37 −0.06+0.09− 0.08 −1.08+0.49− 0.33 20.95+0.82− 0.82 −18.78+1.08− 1.81 0.67+0.27− 0.18 9.61+0.85− 0.60 8.85+0.41− 0.43 0.79+0.29− 0.68 0.38+0.25− 0.41 2.56+0.27− 0.25 0.40+0.44− 0.36
L17 −0.34+0.40− 0.50 −0.08+0.05− 0.05 −1.69+0.47− 0.35 21.13+0.71− 0.79 −18.16+0.70− 1.56 0.59+0.19− 0.13 9.30+0.72− 0.32 8.75+0.37− 0.38 −0.19+0.93− 0.42 −0.16+0.54− 0.81 2.54+0.22− 0.21 0.42+0.35− 0.28
L18 −0.41+0.42− 0.55 −0.11+0.04− 0.04 −1.77+0.45− 0.53 20.36+0.79− 0.57 −19.00+1.23− 1.30 0.60+0.21− 0.17 9.73+0.58− 0.56 9.12+0.24− 0.38 −0.35+0.67− 0.28 −0.49+0.69− 0.63 2.64+0.20− 0.17 0.71+0.24− 0.35
L19 −0.31+0.30− 0.55 −0.14+0.03− 0.03 −1.54+0.42− 0.38 19.68+0.49− 0.47 −20.39+1.03− 0.78 0.73+0.22− 0.21 10.35+0.36− 0.44 9.46+0.18− 0.21 −0.21+0.44− 0.37 −0.29+0.48− 0.74 2.84+0.17− 0.16 0.96+0.04− 0.17
L20 −0.27+0.26− 0.42 −0.15+0.03− 0.03 −1.35+0.35− 0.31 19.41+0.50− 0.47 −21.32+0.73− 0.72 0.93+0.20− 0.20 10.77+0.31− 0.32 9.60+0.19− 0.21 −0.11+0.36− 0.44 −0.17+0.41− 0.62 3.18+0.18− 0.17 1.00+0.00− 0.04
Blob −0.56+0.38− 0.38 −0.14+0.03− 0.02 −1.68+0.35− 0.38 18.83+0.29− 0.31 −19.37+1.10− 1.08 0.21+0.24− 0.18 10.31+0.47− 0.68 10.06+0.38− 0.30 −0.40+0.47− 0.22 −0.48+0.53− 0.41 1.95+0.15− 0.11 0.00+0.48− 0.00
Red spine −0.27+0.24− 0.36 −0.16+0.02− 0.02 −1.45+0.11− 0.11 19.30+0.19− 0.20 −21.52+0.48− 0.45 0.96+0.10− 0.10 10.85+0.20− 0.20 9.65+0.10− 0.09 −0.16+0.30− 0.38 −0.24+0.38− 0.54 3.23+0.10− 0.09 1.00+0.00− 0.00

Note. The blob and red spine groups are detailed in Section 6.5.

Looking at the seven properties on which the WPCA was built, the most important feature is that they present the same shape as the PCi in which they have the greatest leverage. On the other hand, the properties not present in the WPCA show similar shapes or behaviors, depending on their individual correlations with the initial seven properties. Generally, their behavior (as a function of arc length) is defined by the four branches and three turning points, resembling in most cases a distorted W of the P-curve.

Thus, we can group all the properties with respect to which PCi they most resemble. For example, Figure 1 shows that log M*/Lr, log SFR/M*, u − r, and g − r have the greatest leverage in PC1 among the first four PCs. This makes these properties resemble the shape of PC1 in Figure 4, where it is mostly a linear relation with respect to arc length, with a scatter modulated by the turning points. Other properties that correlate with PC1 are r − i, g − z, Dn(4000), some Lick indices, and [O iii]. Note that [O iii] has a strong linear dependence on PC1 and behaves differently from the other emission lines (such as the Balmer series) due to its higher ionization degree. The arc length also correlates linearly with eclass, which is the classification parameter derived in the PCA of galaxy spectra in Yip et al. (2004a), defined as a function of the expansion coefficients in the bases of the first two eigenspectra. In general, these properties can be expressed as linear combinations of each other, as is common in astrophysical practice, e.g., M*/L = a × Color + b (e.g., Baldry et al. 2004).

In the same way, log SFR and μ50, r resemble strongly the "W" shape of PC2. Some emission lines belonging to this group are N ii and Hα, the latter being a well-known proxy for SFR (Kennicutt 1998). Also, Mr is a good proxy for log M* (Bell et al. 2003), both of which are related to R90r and μ*, r, and seemingly correlate with μ50, r. Interestingly, the shape of the average Lick Na D index is similar to the "W" shape of PC2 (and SFR), but the larger 1σ dispersion in the average makes it not very significant. Lick Na D is expected to represent a strong absorption feature in old stellar populations, as shown in the ellipticals at high arc lengths. We can, however, see that it presents also a relatively high average at the second turning point. This is related to the fact that Na absorption is not only present in stars, but also in the interstellar medium as a consequence of outflows or winds present in high star formation spiral galaxies (Chen et al. 2010).

Note that the boomerang shape of R90r/R50r is almost identical to PC3. Also correlating with the concentration index is fracDeVr (Stoughton et al. 2002), which determines the mixing, in the modeling of galaxy light profiles, between an exponential disk and a de Vaucouleurs r^(1/4) law for the elliptical bulge.

6.2.1. The BPT Diagram

Figure 7 shows the emission-line ratios BPT diagram (Baldwin et al. 1981) of the MGS galaxies, where active galactic nucleus (AGN) identification can be done easily. We considered galaxies presenting Hα, Hβ, [N ii], and [O iii] emission lines with well-measured equivalent widths, of fractional errors smaller than 0.33 and velocity dispersion smaller than 500 km s−1. These cuts reduce the L1–L15 groups to ∼90% of their size, going down to 35% for the remaining groups at higher arc length. Since none of these emission lines were included in the building of the WPCA, we overplotted the average location of the {Li} (i = 1, ..., 20) groups.

Figure 7. BPT diagram of the MGS galaxy sample. Symbols connected with straight lines track the positions of the average log [N ii]/Hα and log [O iii]/Hβ of each {Li} (i = 1, ..., 20) (1σ dispersion bars also included). Colored dots are random 2% samples of each group. Dashed and dotted lines show the separators from Kauffmann et al. (2003a) and Kewley et al. (2001), respectively, between pure star-forming galaxies (left region), composite (central), and AGN (right).

The average locations of the groups can be seen to be connected by a two-branched track in line-ratio space. The left branch covers the region of star-forming galaxies, whereas the right branch crosses the separator of Kauffmann et al. (2003a) into the region filled by AGNs. Interestingly, the joining point between the two branches happens at L14, which contains the second turning point in the principal curve. This is a striking feature, as it shows that the P-curve is powerful enough to describe galaxy properties beyond the ones included in its construction.
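The three BPT regions can be illustrated with a short sketch. The coefficients below are the commonly quoted forms of the Kauffmann et al. (2003a) and Kewley et al. (2001) demarcation curves; they are not given in this text and are therefore an assumption here, as are the function names.

```python
import numpy as np

def kauffmann03(log_nii_ha):
    """Empirical star-forming boundary (assumed standard coefficients)."""
    return 0.61 / (log_nii_ha - 0.05) + 1.3

def kewley01(log_nii_ha):
    """Theoretical maximum-starburst boundary (assumed standard coefficients)."""
    return 0.61 / (log_nii_ha - 0.47) + 1.19

def bpt_class(log_nii_ha, log_oiii_hb):
    """Label each galaxy as 'star-forming', 'composite', or 'AGN'."""
    x = np.asarray(log_nii_ha, dtype=float)
    y = np.asarray(log_oiii_hb, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        # Each curve is only meaningful to the left of its vertical asymptote
        below_ka = (x < 0.05) & (y < kauffmann03(x))
        below_ke = (x < 0.47) & (y < kewley01(x))
    return np.where(below_ka, "star-forming",
                    np.where(below_ke, "composite", "AGN"))
```

Applied to the average line ratios of each group, such a function would show at which group the track crosses from the star-forming region into the composite and AGN regions.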

6.3. Luminosity Functions

Figure 8 shows the LFs corresponding to the {Li} (i = 1, ..., 20) equal number density groups. They can be directly compared to the evolution of Mr as a function of arc length in Figure 6. The LFs were computed with the Vmax method of Schmidt (1968) explained in Section 5.1, where the estimated LF value at each magnitude bin is the sum of the weights of all galaxies in that bin, with wi = 1/Vmax, i. Table 4 contains the fitting parameters, with the fitting functions as follows:

  1. Double power law
    Equation (3)
    provided that 1 + ξ(L/L*) ⩾ 0. This double power-law fit (Alcaniz & Lima 2004) collapses when ξ = 0 into the Schechter fit (Schechter 1976) Φ(L) dL = ϕ* (L/L*)^α exp(−L/L*) dL/L*. The ξ parameter can be related to the tail index in extreme value statistics (e.g., Gumbel 1958; Galambos 1978), defining for the brightest luminosities an infinitely extended power-law tail (ξ > 0), an exponential tail (ξ = 0), or a cutoff at a finite maximum luminosity Lmax = L*/|ξ| (ξ < 0).
  2. Lognormal
    Equation (4)
    Note that a lognormal distribution in luminosity space is equivalent to a normal (Gaussian) distribution in magnitude space.
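Equation (3) is not reproduced in this text. A form consistent with the properties stated above (Schechter limit at ξ = 0, finite cutoff at Lmax = L*/|ξ| for ξ < 0, power-law tail for ξ > 0) is Φ(L) dL = ϕ* (L/L*)^α [1 + ξ(L/L*)]^(−1/ξ) dL/L*. The sketch below assumes this form, which should be checked against Alcaniz & Lima (2004); the function names are hypothetical.

```python
import numpy as np

def double_power_law(x, phi_star, alpha, xi):
    """Assumed double power-law LF per unit x = L/L*.
    Reduces to the Schechter function when xi == 0."""
    x = np.asarray(x, dtype=float)
    if xi == 0.0:
        tail = np.exp(-x)
    else:
        # For xi < 0 the function vanishes beyond the cutoff x = 1/|xi|
        base = np.clip(1.0 + xi * x, 0.0, None)
        tail = base ** (-1.0 / xi)
    return phi_star * x**alpha * tail

def cutoff_magnitude(m_star, xi):
    """Magnitude of the bright-end cutoff Lmax = L*/|xi| (xi < 0 only)."""
    if xi >= 0:
        raise ValueError("a finite cutoff exists only for xi < 0")
    return m_star + 2.5 * np.log10(abs(xi))
```

With this convention, `cutoff_magnitude(m_star, xi)` converts a fitted (M*, ξ) pair with ξ < 0 into the magnitude at which the fitted LF drops to zero.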
Figure 8. Luminosity functions of the {Li} (i = 1, ..., 20) groups ordered by increasing arc length (blue triangles with a dashed line fit from Table 4). The aggregated luminosity function of all the Li samples is shown as black circles with the Schechter fit as a black continuous line. The red diamonds denote the luminosity function belonging to the group of red galaxies located very close to the principal curve within the L20 group (see Section 6.5.2).

Table 4. Luminosity Function Fitting Parameters (a)

Group ϕ* × 10^3 (b) M* α ξ
All 4.90 ± 0.14 −21.30 ± 0.03 −0.91 ± 0.02 0
L1 0.13 ± 0.02 −20.45 ± 0.12 −1.60 ± 0.05 0
L2 0.19 ± 0.03 −19.90 ± 0.13 −1.54 ± 0.07 0
L3 0.27 ± 0.08 −19.48 ± 0.21 −1.56 ± 0.13 0
L4 0.36 ± 0.08 −19.21 ± 0.15 −1.62 ± 0.10 0
L5 0.85 ± 0.21 −18.45 ± 0.18 −1.34 ± 0.18 0
L6 0.68 ± 0.19 −18.54 ± 0.17 −1.69 ± 0.15 0
L7 1.50 ± 0.10 −18.07 ± 0.07 −0.87 ± 0.10 0
L8 1.13 ± 0.09 −18.34 ± 0.11 −0.65 ± 0.16 0
L9 0.93 ± 0.05 −18.86 ± 0.06 −0.76 ± 0.07 0
L10 0.96 ± 0.04 −19.20 ± 0.06 −0.42 ± 0.08 0
L11 1.04 ± 0.01 −19.41 ± 0.04 0.03 ± 0.06 0
L12 1.00 ± 0.01 −19.85 ± 0.04 0.12 ± 0.06 0
L13 0.99 ± 0.01 −20.34 ± 0.02 0.06 ± 0.03 0
L14 0.95 ± 0.01 −20.68 ± 0.02 −0.07 ± 0.02 0
L15 0.69 ± 0.01 −20.92 ± 0.03 −0.46 ± 0.03 0
L16 0.16 ± 0.02 −22.01 ± 0.15 −1.16 ± 0.04 −0.32 ± 0.02
L17 0.016 ± 0.004 −23.15 ± 0.28 −1.69 ± 0.03 −1.167 ± 0.004
L18 0.41 ± 0.09 −20.69 ± 0.24 −0.93 ± 0.14 0
Group ϕ* × 10^4 (b) μM σM ...
L19 8.52 ± 0.19 −20.46 ± 0.02 0.78 ± 0.02 ...
L20 9.00 ± 0.11 −21.34 ± 0.01 0.70 ± 0.01 ...
Red spine 0.55 ± 0.02 −21.51 ± 0.02 0.46 ± 0.01 ...

Notes. (a) Fits to Equations (3) and (4). The parameters in magnitude space can be expressed as luminosities using Mr = −2.5 log10[L/L☉] + M☉, r, where M☉, r = 4.62 (Blanton et al. 2001). (b) ϕ* in units of Mpc−3 mag−1.

The LF of the whole sample is well fitted by the Schechter fit, except for the bumps at the high-luminosity tail and at the low-luminosity end starting at Mr ∼ −18.8, also observed by Blanton et al. (2003, 2005). We fitted it right before the bump.
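The Vmax estimate used for these LFs (the sum of the weights 1/Vmax of the galaxies in each magnitude bin) can be sketched as follows. Dividing by the bin width to express the result per magnitude, and the error estimate from the summed squared weights, are assumptions of this illustration rather than details stated in the text; the function name is hypothetical.

```python
import numpy as np

def vmax_luminosity_function(abs_mag, vmax, bin_edges):
    """LF estimate per magnitude bin: sum of 1/Vmax in the bin,
    divided by the bin width (assumed normalization)."""
    weights = 1.0 / np.asarray(vmax, dtype=float)
    bin_edges = np.asarray(bin_edges, dtype=float)
    phi, _ = np.histogram(abs_mag, bins=bin_edges, weights=weights)
    # Assumed Poisson-like uncertainty: sqrt of the summed squared weights
    var, _ = np.histogram(abs_mag, bins=bin_edges, weights=weights**2)
    width = np.diff(bin_edges)
    return phi / width, np.sqrt(var) / width
```

The returned arrays give the LF value and its approximate uncertainty in each bin, in units of inverse volume per magnitude.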

The changes in the behavior of the LF along the P-curve are determined also by the three turning points. In summary, M* becomes fainter in the first branch, and then brighter afterward. On the other hand, the slope of the faint tail behaves similarly to PC2. In fact, it is steep in the first branch, becomes shallower in the second branch, then again steep in the third branch. The fourth branch contains an extremely shallow slope, where the LFs resemble mostly a lognormal distribution. We can see that we recover the LF shapes shown in Binggeli et al. (1988), which are based on morphological types and vary between Schechter fits (as in L1 to L15) and bell-shaped LFs fitted by lognormal fits (such as L19 and L20).

As we progress along the first branch of the principal curve (L1 to L6), the BCDs become less luminous on average, with M* dimming by about ΔM* ≃ 2 and a mostly constant steep faint end (α ∼ −1.55). Note that these galaxies, and mostly the low surface brightness spirals at the first turning point, create the bump seen in the overall LF. In fact, L6 contains the faintest galaxies in our sample (Figure 6). At this point, α = −1.69 gives the steepest power-law slope at the faint luminosity tail. Note that this slope is expected to reach α ≃ −1.5, as noted in Blanton et al. (2005).

In the second branch (L7 to L13), the star-forming spirals present a faint end that flattens dramatically and starts to drop continuously, with an increase of Δα ∼ 1.6 from L7. At the same time, they start becoming much more luminous, with M* brightening by ΔM* ∼ −2.2 (from L6).

In the third branch (L14 to L17), for the red spirals and lenticulars M* continues becoming brighter (ΔM* ∼ −2.5), but at the same time the faint-end slope starts becoming steep again (Δα ∼ −1.6), back to the values of α ∼ −1.6 found at the end of the first branch. Note that L16 (green maximum) and L17 (red maximum) show long power-law faint ends with a sharp cutoff at the bright end. They are better fitted by a double power-law fit, and since ξ < 0 they present bright-end finite cuts at Mr ∼ −23.0 and −23.3, respectively.

The third turning point (L18) is a unique case. The LF presents three power-law-like sections, the faintest one being flat. We attempted to fit it with a Schechter profile.

In the last two groups (L19 and L20), the faint-end tail has dropped enormously. We attempted a lognormal fit for the luminosities, since the LFs look more bell shaped, especially the ones belonging to the {λi} (i = 1, ..., 5) groups that track the spine.

6.4. Galaxy Clustering

In this section, we investigate the spatial distribution of galaxies as a function of arc length, quantified by the second moment of the galaxy distribution, i.e., the clustering. We explore here not only the dependence of galaxy clustering on L, but also the relative distribution of galaxies with different L, which can be quantified by the cross-correlation function.

Following common practice, we compute first the redshift space correlation function as a function of the distances parallel (π) and perpendicular (rp) to the line of sight. We use a generalized version (Szapudi & Szalay 1998) of the Landy & Szalay (1993) estimator

Equation (5)

where the subscripts a and b refer to the two samples we are considering when measuring the cross-correlation function. We use the same methods as Heinis et al. (2009) to compute the correlation functions. In brief, for each sample we generate random catalogs following the SDSS footprint defined by its sectors. We use 50 times more random objects than galaxies. We reproduce the selection function by randomly drawing redshifts from the current sample. We correct for fiber collisions using the method described in Heinis et al. (2009). Note that the fiber collision correction applies only to the DaDb term in Equation (5).
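Equation (5) is not reproduced in this text. The generalized Landy & Szalay estimator for two samples is commonly written as ξ = (DaDb − DaRb − RaDb + RaRb)/RaRb, with every pair count normalized by the number of possible pairs. The sketch below assumes that form and that normalization; the function and argument names are hypothetical, and the fiber collision correction is not included.

```python
import numpy as np

def landy_szalay_cross(dadb, darb, radb, rarb, n_da, n_db, n_ra, n_rb):
    """Cross-correlation estimate per (rp, pi) bin from raw pair counts.
    n_da, n_db: galaxy sample sizes; n_ra, n_rb: random catalog sizes."""
    # Normalize each count by the total number of cross pairs available
    dd = np.asarray(dadb, dtype=float) / (n_da * n_db)
    dr = np.asarray(darb, dtype=float) / (n_da * n_rb)
    rd = np.asarray(radb, dtype=float) / (n_ra * n_db)
    rr = np.asarray(rarb, dtype=float) / (n_ra * n_rb)
    return (dd - dr - rd + rr) / rr
```

For an autocorrelation (a = b) the pair normalization differs, since each pair within a single catalog is counted once; that case would need its own normalization factors.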

As ξ(rp, π) is sensitive to redshift distortions, we consider the projected spatial correlation function afterward, which is free from such effects:

Equation (6)

where we use πmax = 25 Mpc for convergence purposes.
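Equation (6) is not reproduced in this text. The projected correlation function is commonly defined as wp(rp) = 2 ∫ ξ(rp, π) dπ, integrated from 0 to πmax. A minimal numerical sketch under that assumption, with hypothetical function and argument names, is:

```python
import numpy as np

def projected_correlation(xi_rp_pi, pi_edges, pi_max=25.0):
    """wp(rp) = 2 * sum over pi bins of xi(rp, pi) * d_pi, up to pi_max.
    xi_rp_pi has shape (n_rp, n_pi); pi_edges has n_pi + 1 entries."""
    xi = np.asarray(xi_rp_pi, dtype=float)
    edges = np.asarray(pi_edges, dtype=float)
    d_pi = np.diff(edges)
    # Keep only the bins that lie entirely below pi_max
    keep = edges[1:] <= pi_max
    return 2.0 * np.sum(xi[:, keep] * d_pi[keep], axis=1)
```

The default `pi_max=25.0` mirrors the value quoted in the text, in the same distance units as `pi_edges`.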

We compute error bars on wp(rp) from jackknife resampling. We build jackknife samples using the SDSS stripes, which are defined to be 2.5° wide great circles on the sky, following the survey latitude. In practice, we consider 23 jackknife samples built from the stripes.
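The jackknife error estimate can be sketched as follows, assuming the standard leave-one-out form in which wp(rp) is remeasured with each stripe-based subsample removed in turn. The exact construction of the 23 samples is not detailed here, so the input layout and function name are assumptions.

```python
import numpy as np

def jackknife_errors(wp_jack):
    """wp_jack: array of shape (n_jack, n_bins), one wp(rp) measurement
    per jackknife sample. Returns the 1-sigma errors and the covariance."""
    wp_jack = np.asarray(wp_jack, dtype=float)
    n = wp_jack.shape[0]
    diff = wp_jack - wp_jack.mean(axis=0)
    # The (n - 1) / n factor accounts for the overlap between samples
    cov = (n - 1) / n * diff.T @ diff
    return np.sqrt(np.diag(cov)), cov
```

The diagonal of the covariance matrix gives the error bars, while the off-diagonal terms describe the correlation between rp bins.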

We use a volume-limited sample extracted from our main sample (see Section 2). In order to maximize the signal-to-noise ratio of the clustering measurements, we do not use 20 samples in L, but 8 of them built in the following way: we combined L1 to L6 into two groups of a similar number of galaxies (Lw1 and Lw2), L7 to L10 into one group (Lw3), and the remaining L samples two by two (Lw4 = {L11, L12}, Lw5 = {L13, L14}, Lw6 = {L15, L16}, Lw7 = {L17, L18}, and Lw8 = {L19, L20}).

Figure 9 shows the results for the autocorrelation functions of these samples in the diagonal plots. As a reference, we show as a solid line in all diagonal plots the autocorrelation function of the sample with the highest arc length (Lw8). The results from the autocorrelation function show that the amplitude of the correlation function at large scales (rp ∼ 10 Mpc) increases with the arc length. This implies that the host halo mass also increases with l. This result is expected as l does correlate with u − r and g − r colors, for instance. It is indeed well known that the amplitude of the correlation function increases for redder objects in the local universe (e.g., Zehavi et al. 2005, 2011). There are also some interesting features in the small-scale (rp < 0.1 Mpc) clustering. Indeed, most samples show clustering power at all scales, except groups Lw2 and Lw3 in particular, where there is an indication of a lack of pairs at rp < 0.1 Mpc, suggesting a population mainly composed of central galaxies.

Figure 9. Left: the diagonal panels show the autocorrelation function of the {Lwi} groups. The solid black line is the autocorrelation function of Lw8 shown for reference. Off-diagonal panels show the cross-correlation functions between the groups, where the red solid line represents the expected cross-correlation function when the galaxies from the two samples are well mixed in the dark matter halos. Right: same layout as on the left, but showing the ratio of the measured cross-correlation function to the one expected when the galaxies from the two samples are well mixed in the dark matter halos.

In the off-diagonal plots, we show the cross-correlation functions between these samples. It is beyond the scope of this paper to fully interpret all these measurements with Halo Occupation Distribution models (e.g., Cooray & Sheth 2002). We will use simple arguments to highlight the information contained in the cross-correlation function.

We represent as a solid line in all off-diagonal plots the expected cross-correlation function given by

Equation (7)

where wa(rp) and wb(rp) are the autocorrelation functions of samples a and b. Equation (7) gives the cross-correlation function that is expected in the case where galaxies from the two samples are well mixed in the dark matter halos hosting them (see, e.g., Zehavi et al. 2005). This is of interest at small scales, where the cross-correlation function contains information about close pairs of galaxies that lie within the same dark matter halos. Our results show an interesting trend in this context. Indeed, the cross-correlation function of galaxies with close arc lengths is similar to the expected cross-correlation function from Equation (7). On the other hand, the cross correlation of galaxies more distant in terms of arc length diverges from the expected correlation function. In particular, the measurements are overestimated by Equation (7) at scales rp < 0.1 Mpc. This means that there are fewer close galaxy pairs in the measurements than expected in the case of a perfect mix between galaxies of different arc length. Note that there is still a clustering signal at these scales, which means that some galaxies belonging to distant L groups do share the same dark matter halos. However, our results show that there are fewer pairs than expected if galaxies are properly mixed. This suggests that the probability that two galaxies reside in the same dark matter halo decreases as a function of their distance in arc length.
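Equation (7) is not reproduced in this text. For well-mixed samples, the expected cross-correlation is commonly taken to be the geometric mean of the two autocorrelation functions, wab(rp) = sqrt(wa(rp) wb(rp)) (as in Zehavi et al. 2005). The sketch below assumes that form; the function names are hypothetical.

```python
import numpy as np

def expected_cross(w_a, w_b):
    """Expected cross-correlation for well-mixed samples, assuming the
    geometric mean of the two autocorrelation functions."""
    return np.sqrt(np.asarray(w_a, dtype=float) * np.asarray(w_b, dtype=float))

def mixing_ratio(w_ab, w_a, w_b):
    """Measured over expected cross-correlation; values below 1 mean fewer
    pairs than for perfectly mixed samples."""
    return np.asarray(w_ab, dtype=float) / expected_cross(w_a, w_b)
```

A ratio that drops below 1 at small rp for groups far apart in arc length corresponds to the deficit of close pairs described above.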

6.5. Interesting Galaxy Groups Found from WPCA and Principal Curve Classification

The analysis of galaxy properties with WPCA and P-curve methods allowed us to find and pinpoint some relevant groups that stand aside from the main trends of the whole galaxy sample. In particular, here we pay attention to the small blob of galaxies that clusters apart from the main cloud in Figure 2. Another interesting task is to isolate a pure population of red galaxies in the L20 group whose LF is lognormal, as it appears in Figure 8.

6.5.1. Blob of Small Red Disk Galaxies/Lenticulars of High M*/Lr and μ*, 50

In Figure 2 we found a small blob of galaxies clustered apart from the main cloud in the PC3 vs. PC4 panel. We separated them using the following line (drawn by eye): PC4 ⩽ −1.3 + 0.55(PC3 − 2.0). We further constrained these galaxies by choosing an appropriate interval in arc length, [lmin, lmax] = [11.6, 12.6], which includes most of the L18 group and part of the L19 group. In fact, the blob also appears in Figure 6, bracketed within this arc-length range in the R90r/R50r, μ*, r, and log R90r panels. Indeed, R90r/R50r is the property that dominates PC3 and PC4 with opposite signs, as shown in Figure 1. After the selection cuts, we are left with 136 blob galaxies, whose imaging and average spectrum are shown in Figure 10.
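The two selection cuts can be written as a single predicate. The following is a minimal sketch: the coefficients of the separating line and the arc-length interval are those quoted above, while the function name and the sample coordinates are hypothetical.

```python
def is_blob_galaxy(pc3, pc4, arc_length, l_min=11.6, l_max=12.6):
    """Return True if a galaxy passes both blob selection cuts."""
    # By-eye separating line in the PC3 vs. PC4 plane (coefficients from the text).
    below_line = pc4 <= -1.3 + 0.55 * (pc3 - 2.0)
    # Closed arc-length interval [l_min, l_max].
    in_arc_range = l_min <= arc_length <= l_max
    return below_line and in_arc_range

# Hypothetical (PC3, PC4, arc length) triples, for illustration only.
sample = [
    (2.0, -1.5, 12.0),  # below the line, inside the interval
    (2.0, -1.0, 12.0),  # above the line
    (4.0, -0.5, 13.0),  # below the line, outside the interval
]
selected = [is_blob_galaxy(*row) for row in sample]
print(selected)
```

Only the first hypothetical galaxy satisfies both conditions; the other two each fail one of the cuts.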

Figure 10. Equivalent to Figure 5, but showing interesting groups derived from WPCA and P-curve analysis. Top: panels with the four most representative galaxy shapes for the small red disk-like galaxies blob (left) and red spine galaxies (right). Bottom: average spectrum for the two previous groups.

According to Table 3, the blob is composed mostly of small red disk galaxies with minimal star formation and devoid of gas. In fact, u − r is at least 1σ redder than the average in the L18 group. Furthermore, their light profiles are modeled with a substantial exponential disk component ($\mathrm{fracDeV}_r = 0.00^{+0.48}_{-0.00}$). The concentration index R90r/R50r ≃ 1.95 is low and far from the often used R90r/R50r = 2.6 separator between elliptical and spiral galaxies (Strateva et al. 2001). The average size of R90r ≃ 1.6 kpc is small compared with that of L18 (R90r ≃ 4.0 kpc), lying beyond the 1σ significance level. Interestingly, the small size places $M_*/L_r \simeq 4.07\,M_\odot\,L_{\odot,r}^{-1}$, $\mu_{50,r} \simeq 18.83$ mag arcsec$^{-2}$, and $\mu_{*,r} = 10^{10.06}\,M_\odot\,\mathrm{kpc}^{-2}$ well beyond 1σ of their L18 averages as well. Regarding the uncertainty of these values, errors in the estimate of M* can arise when fitting the spectra, especially when there is a strong dust component, which these galaxies appear to have. The statistical error in M* is around 15%. Comparing with the group catalog of Tago et al. (2010), we found that at least half of these galaxies are in groups of more than 10 members. We speculate that they were depleted of gas by ram pressure stripping or another mechanism that did not perturb the structure of the disk.

6.5.2. Close-to-Spine Red Galaxies of Lognormal Luminosity Function

In Figure 8 we show that the L20 group of red galaxies has an LF close to a lognormal distribution, clearly different from the gamma and double power-law distributions of the other groups. For further study, we wanted to isolate the galaxies in this group whose LF is exactly lognormal.

In order to extract these galaxies, we chose the ones falling in {λ98, λ99, λ100}, which are the last three subgroups of L20 as explained in Section 5.3. We further chose the first partition closest to the P-curve, out of the 10 radial partitions in each λi group, selecting therefore galaxies on or very close to the spine of the data point cloud.

The selected red spine galaxies are part of the very high density core of the red sequence of galaxies found in L20. In fact, Table 3 and Figure 6 show that these red spine galaxies have average property values very close to the averages of the properties of the whole L20 galaxy group. They are mostly red ellipticals (u − r ≃ 2.62, fracDeVr ∼ 1, and R90r/R50r ≃ 3.23) of mass $M_* \simeq 7.08 \times 10^{10}\,M_\odot$ and luminosity $L_r \simeq 2.86 \times 10^{10}\,L_{\odot,r}$. Figure 8 shows that the LF of the close-to-spine red galaxies is in fact very close to a lognormal distribution, with fitting parameters shown in Table 4.

7. DISCUSSION

The unsupervised nonparametric methods of WPCA and P-curve should not be considered useful only for dimensionality reduction and easy data visualization. In this paper, these methods also proved able to provide supporting evidence for some physical models and scenarios relevant in extragalactic astronomy, as discussed next.

7.1. Information Content of the Principal Curve

The principal curve provides an objective way to order galaxies along its arc length. The success of the P-curve in dimensionality reduction and classification is related to how widely the projections of the galaxies are spread along arc length: a large variance along arc length gives more room for building separators and discerning different galaxy types. In our case, the arc-length values along the P-curve have a variance of $\sigma_l^2 = 7.79$ (see Table 2), which is larger than the cumulative variance $\sum_{i=1}^{4}\sigma_{\mathrm{PC}_i}^2 = 6.81$ of the principal components from which it was built. This means that the curvature of the P-curve helps to discern information not captured by the intrinsically linear WPCA. The length of the P-curve cannot be made arbitrarily long or short, owing to an evident bias-variance trade-off. The shortest curves (with no curvature) are identical to PC1, with an average spread across them equal to $\sum_{i=2}^{p}\sigma_{\mathrm{PC}_i}^2$, and a high bias because the straight P-curve misses the important bends in the structure of the cloud of points. On the other hand, the longest possible curves would be those connecting all the data points, which produce null bias but high variance, as the curve fits the noise in the structure of the cloud. In fact, when we experimented with values of df ≃ 7, the curve attempted to cover all the space spanned by the cloud, twisting and coiling itself in ways that describe additional detailed features of galaxies, whereas here we are interested in the global trends. Our choice of df = 5.4 is an intermediate case, where the root mean square of the projection distances onto the curve is d = 1.31, smaller than $\left(\sum_{i=2}^{4}\sigma_{\mathrm{PC}_i}^2\right)^{1/2} = 1.52$. The ratio 1σ2l/d2 = 0.75 gives us a notion of the amount of information that the P-curve is able to discern. The physical origin of the scatter in the remaining 25% is still to be explored and depends locally on the direction in the eigenspace along which d is measured.
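The numbers quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only values stated in the text. Reading the 75% figure as one minus the ratio of the squared rms projection distance to the total variance of the four principal components is an assumption made for this check (it reproduces 0.75 to rounding), not a formula confirmed by the text. All variable names are hypothetical.

```python
# Values quoted in the text.
var_pcs_total = 6.81    # summed variance of PC1..PC4
rms_pc2_to_4 = 1.52     # sqrt of summed variance of PC2..PC4
d_rms = 1.31            # rms projection distance onto the P-curve

# Variance left unexplained by a straight line along PC1.
var_pc2_to_4 = rms_pc2_to_4 ** 2
# Variance carried by PC1 alone, by subtraction.
var_pc1 = var_pcs_total - var_pc2_to_4

# ASSUMED reading: residual fraction = d^2 / total PC variance.
residual_curve = d_rms ** 2 / var_pcs_total
explained_curve = 1.0 - residual_curve

# Same quantity for a straight line along PC1, for comparison.
residual_line = var_pc2_to_4 / var_pcs_total
explained_line = 1.0 - residual_line

print(f"PC1 variance            : {var_pc1:.2f}")
print(f"explained by PC1 line   : {explained_line:.2f}")
print(f"explained by the P-curve: {explained_curve:.2f}")
```

Under this reading, the bends of the curve recover a larger share of the total variance than a straight line along PC1 does, which is the sense in which the curvature adds information.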

7.2. Explaining the Zoo of Galaxies

In our analysis, the P-curve has been able to recover the well-known bimodality between the blue and red populations. Since galaxy properties are highly correlated, only a few properties should be enough to explain the variations in the zoo of galaxies, namely u − r (from PC1), SFR (from PC2), and, less importantly, R90r/R50r (from PC3). In fact, the variations recovered by the P-curve and its "W" shape depend strongly on SFR. The color u − r, almost linearly correlated with arc length, tracks specifically the 4000 Å break in the continuum, which gives a measure of stellar age and separates early from late galaxy types. However, color alone is not enough: R90r/R50r traces morphology, whereas the SFR tracks the amount of material produced in recent starbursts, which shows up in the strength of emission lines such as the Balmer series and ultimately correlates with galaxy size, mass, and luminosity. For example, within the blue population (bluer than the green maximum) we find spiral galaxies of low star formation and low surface brightness separating the bluest star-forming spheroidals/irregulars from the redder star-forming spirals with a prominent bulge. The importance of color and of emission lines from star formation in explaining variations in galaxy populations has also appeared in previous PCA studies (e.g., Yip et al. 2004a; Coppa et al. 2011; Győry et al. 2011).

7.3. Additional Evidence Supporting Some Physical Models

7.3.1. AGN Activity and Star Formation Quenching

The P-curve presents a green density maximum between the blue and red ones. The green maximum shows an interesting feature in PC1 (Figure 4), colors, and M*/Lr as a function of arc length (Figure 6). The averages of these properties keep increasing with arc length, except at the green maximum in L16, where they stay constant or even decrease. This behavior is not seen, however, in SFR/M*, for example, whose average continues to decrease monotonically at L16. Note that L16 is the last group that shows significant star formation and/or emission lines (see Figure 5). Indeed, the equivalent width of Hα drops by 1 dex (see Table 2) when moving to L17; the latter group is consistent with no Hα emission given the 1 Å resolution of SDSS spectra. Furthermore, L16 is the last group on the pure star-forming branch of the BPT diagram (Figure 7), right before the border separating the pure star-forming and composite regions. Thus, groups at higher arc length have little to no star formation activity and contain AGNs. This agrees with findings that AGN activity might cause the shutdown of star formation in these galaxies (Martin et al. 2007).

7.3.2. Hierarchical Model of Galaxy Formation

The LFs shown in Figure 8 can be classified roughly into gamma and lognormal distributions. It has been shown, e.g., in Cooray & Milosavljevic (2005) and Yang et al. (2009), that the Schechter fit to the LF can be divided into several components arising from two different populations in dark matter halos: central galaxies (or BCGs) and satellite galaxies. Satellite galaxies are often assigned a power-law LF, with a finite cutoff at the bright end set by the luminosities of the central galaxies. The centrals, on the other hand, are assigned bell-shaped LFs. In particular, high-mass halos ($M_h > 10^{13}\,M_\odot$, according to Cooray & Milosavljevic 2005) contain central galaxies whose LFs can be modeled as a lognormal distribution. This is the behavior observed for the L19 and L20 groups in Figure 8, and it is seen more clearly for the red spine galaxies presented in Section 6.5.2. Note that L19 and L20 correspond to Lw8 in Figure 9, whose autocorrelation function appears to have stronger power at rp = 10 Mpc than that of any other group. On the other hand, at rp ≲ 0.1 Mpc there is a clear loss of power. This is consistent with L19 and L20 being composed mostly of central galaxies. Note that there is still power in the autocorrelation function of Lw8 at rp < 0.1 Mpc, which shows that this sample contains some satellite galaxies (mostly red), consistent with the faint-end tail of the LFs in Figure 8.

Lognormal distributions appear in nature as a consequence of multiplicative processes (Limpert et al. 2001; Mitzenmacher 2003, and references therein), where the initial value $Y_0$ of a random variable is changed in successive steps of the form $Y_j = F_j Y_{j-1}$ by i.i.d. multiplicative factors $F_j$ drawn from a distribution P(F). Taking logarithms turns the product into a sum, $\ln Y_j = \ln Y_0 + \sum_{k=1}^{j} \ln F_k$, so the central limit theorem applied for large j shows that ln Y tends to a normal distribution. Hence Y follows a lognormal distribution independent of the shape of P(F), provided ln F has finite variance. This argument can be extended to explain the lognormal LFs of central galaxies and their stellar mass functions as well, since the r-band luminosity traces a population of old stars that form the bulk of the mass of galaxies (Bell et al. 2003). In fact, hierarchical galaxy formation models (e.g., Steinmetz & Navarro 2002; De Lucia & Blaizot 2007) explain the creation of massive elliptical BCGs as a series of dry mergers of existing galaxies. Thus, a dense environment allows several steps of mass accretion or stripping that might lead to the formation of BCGs and produce their lognormal mass distributions.
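The multiplicative argument can be illustrated with a small simulation. This is a toy sketch, not a galaxy formation model: the uniform distribution chosen for P(F), the number of steps, and the sample size are arbitrary assumptions, and all names are hypothetical. If the argument holds, the logarithm of the final values should have a mean close to the number of steps times the mean of ln F, and a skewness close to zero, as expected for a normal distribution.

```python
import math
import random
import statistics

def simulate_log_values(n_objects, n_steps, rng, f_min=0.5, f_max=2.0):
    """Return ln(Y) after n_steps multiplicative updates Y_j = F_j * Y_{j-1}.

    F is drawn uniformly from [f_min, f_max]; this P(F) is an arbitrary choice.
    """
    logs = []
    for _ in range(n_objects):
        y = 1.0  # initial value Y_0
        for _ in range(n_steps):
            y *= rng.uniform(f_min, f_max)
        logs.append(math.log(y))
    return logs

def skewness(values):
    """Population skewness: mean cubed deviation in units of sigma^3."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return sum(((v - mean) / std) ** 3 for v in values) / len(values)

n_steps = 30
rng = random.Random(42)  # fixed seed for reproducibility
log_y = simulate_log_values(20000, n_steps, rng)

# Analytic mean of ln F for F uniform on [0.5, 2.0]: integral of ln(x) / 1.5.
mean_log_f = ((2.0 * math.log(2.0) - 2.0) - (0.5 * math.log(0.5) - 0.5)) / 1.5

log_mean = statistics.fmean(log_y)
log_skew = skewness(log_y)
print(f"mean of ln Y     : {log_mean:.3f} (CLT expectation {n_steps * mean_log_f:.3f})")
print(f"skewness of ln Y : {log_skew:.3f}")
```

Replacing the uniform P(F) with another distribution of finite log-variance should leave the near-zero skewness of ln Y unchanged for a large number of steps, which is the sense in which the lognormal shape is independent of P(F).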

M.T.P. acknowledges the use of the VO spectrum service for averaging galaxy spectra (Dobos et al. 2004) and thanks Ching-Wa Yip, Mark Neyrinck, Timothy Heckman, and Sean Moran for useful discussion. The authors thank the anonymous referee for advice directed at enhancing the impact of this paper.

Funding for SDSS-III has been provided by the Alfred P. Sloan Foundation, the Participating Institutions, the National Science Foundation, and the U.S. Department of Energy Office of Science. The SDSS-III Web site is http://www.sdss3.org/.

SDSS-III is managed by the Astrophysical Research Consortium for the Participating Institutions of the SDSS-III Collaboration including the University of Arizona, the Brazilian Participation Group, Brookhaven National Laboratory, University of Cambridge, Carnegie Mellon University, University of Florida, the French Participation Group, the German Participation Group, Harvard University, the Instituto de Astrofisica de Canarias, the Michigan State/Notre Dame/JINA Participation Group, Johns Hopkins University, Lawrence Berkeley National Laboratory, Max Planck Institute for Astrophysics, New Mexico State University, New York University, Ohio State University, Pennsylvania State University, University of Portsmouth, Princeton University, the Spanish Participation Group, University of Tokyo, University of Utah, Vanderbilt University, University of Virginia, University of Washington, and Yale University.
