Galaxy Zoo: Morphological Classification of Galaxy Images from the Illustris Simulation

Modern large-scale cosmological simulations model the universe with increasing sophistication and at higher spatial and temporal resolutions. These ongoing enhancements permit increasingly detailed comparisons between the simulation outputs and real observational data. Recent projects such as Illustris are capable of producing simulated images that are designed to be comparable to those obtained from local surveys. This paper tests the degree to which Illustris achieves this goal across a diverse population of galaxies using visual morphologies derived from Galaxy Zoo citizen scientists. Morphological classifications provided by these volunteers for simulated galaxies are compared with similar data for a compatible sample of images drawn from the Sloan Digital Sky Survey (SDSS) Legacy Survey. This paper investigates how simple morphological characterization by human volunteers asked to distinguish smooth from featured systems differs between simulated and real galaxy images. Significant differences are identified, which are most likely due to the limited resolution of the simulation, but which could be revealing real differences in the dynamical evolution of populations of galaxies in the real and model universes. Specifically, for stellar masses $M_\star \lesssim 10^{11}\,M_\odot$, a substantially larger proportion of Illustris galaxies exhibit disk-like morphology or visible substructure, relative to their SDSS counterparts. Toward higher masses, the visual morphologies for simulated and observed galaxies converge and exhibit similar distributions. The stellar mass threshold indicated by this divergent behavior confirms recent work using parametric measures of morphology from Illustris simulated images. Among more massive systems, the Illustris data set contains substantially fewer galaxies that classifiers regard as unambiguously featured.
In combination, these results suggest that comparison between the detailed properties of observed and simulated galaxies, even when limited to reasonably massive systems, may be misleading.


INTRODUCTION
As large-scale simulations of the universe increase in size and in resolution, increasingly sophisticated comparisons with observations are becoming feasible. While early work concentrated on matching features of the universe captured by simple parameterizations such as the mass function or scaling relations (e.g. Kauffmann et al. 1993; Cole et al. 1994), modern cosmological simulations produce galaxies with apparently realistic star formation histories, substructures, and colors (e.g. Genel et al. 2014; Crain et al. 2015; Kaviraj et al. 2017). The prospect of "observing" this simulated universe via the creation of artificial images offers the chance to test any such simulation's fidelity, and any discrepancies may provide new insights on the physics that drives galaxy formation and evolution.
The obvious comparison for simulations that model the present-day galaxy population is the Sloan Digital Sky Survey (SDSS; York et al. 2000; Strauss et al. 2002), which has provided a wealth of information about a large number of local systems (see Strateva et al. 2001; Kauffmann et al. 2003; Tremonti et al. 2004; Brinchmann et al. 2004; Baldry et al. 2004, for just some of the most highly cited results). The SDSS augments its galaxy catalogs with a rich suite of spectral, photometric, and instrumental metadata. In particular, the availability of estimated galaxy redshifts and stellar masses is critical for our analysis.
Modern simulations such as Illustris (Vogelsberger et al. 2014a,b; Genel et al. 2014; Sijacki et al. 2015) have been used to construct simulated versions of the SDSS (Torrey et al. 2015), and comparisons between observed and simulated universes have utilized a large range of parameters derived from observations (Snyder et al. 2015; Bottrell et al. 2017a,b). However, much insight can still be gained by relying on morphological classification of galaxy images. Morphology is a sensitive probe of a galaxy's dynamical and star formation histories, and such classifications have been shown to reflect differences between systems that are often difficult to recover from purely parametric approaches (e.g. Bamford et al. 2009; Schawinski et al. 2009; Masters et al. 2010a), and have also helped to unveil previously unnoticed trends and behaviors (e.g. Schawinski et al. 2010; Masters et al. 2011; Simmons et al. 2013; Casteels et al. 2013; Galloway et al. 2015; Smethurst et al. 2016; Kaviraj 2014).
This paper uses visual morphological classifications as a metric for comparison between simulated and observed universes. Using calibrated citizen science data from the Galaxy Zoo project (Lintott et al. 2008; Willett et al. 2013), we provide non-parametric labels for a large number of simulated galaxies and compare these to SDSS galaxies labeled in the same way. In this manner, we aim to investigate the degree to which large cosmological simulations, and specifically Illustris, can claim to match the present-day galaxy population.

DATA
2.1. The Illustris Sample
Illustris is a suite of large-volume cosmological hydrodynamical simulations run with the moving-mesh code Arepo (Springel 2010; Genel et al. 2014). It includes a comprehensive set of physical models that are deemed critical for modeling the formation and evolution of galaxies across cosmic time. Galaxy formation processes in Illustris are simulated following the models described by Vogelsberger et al. (2013) and Torrey et al. (2014). Each of the Illustris simulations encompasses a volume of $(106.5\,{\rm Mpc})^3$ and self-consistently evolves five different types of resolution element (dark matter particles, gas cells, passive gas tracers, particles that represent stars and their stellar winds, and supermassive black holes) from a starting redshift of z = 127 to the present day, z = 0. The Illustris simulation suite successfully reproduces a range of well-established galaxy scaling relations. It offers a unique combination of high resolution and large simulation volume, which provides an ideal test dataset for our purposes.
The Illustris image sample is generated using an ensemble of 6891 unique subhaloes that had assembled within the Illustris simulation volume by z = 0. Each subhalo is assumed to represent a single galaxy. These were chosen to have $M_\star \gtrsim 10^{10}\,M_\odot$, which corresponds to a typical number of stellar particles $\gtrsim 10^{5}$. Simulated galaxies composed of fewer particles were deemed unlikely to accurately represent morphological features of interest (e.g. Torrey et al. 2015), and were therefore excluded from our sample.
We use images from Torrey et al. (2015), which have been processed as described in Snyder et al. (2015) to produce 'observationally realistic' images. This process produces synthetic Illustris images that are square arrays with side length 424 pixels, with a typical angular pixel scale of $0.05''$ to $0.10''$ per pixel. For each image, the precise pixel scaling is adjusted to ensure that the central $2/3$ of each subject image corresponds to twice the simulated galaxy's projected Petrosian radius. This scaling emulates the approach used to generate the original Galaxy Zoo 2 subject images. Each image is convolved with a nominal PSF with full width at half maximum (FWHM) $\sim 1.0''$, which is similar to the $\sim 1.4''$ average seeing for the SDSS DR7, so the two sets of images should be broadly comparable. It should be noted that these images represent a simulation of galaxies that have evolved until redshift zero, but are projected as if they lie at z = 0.05. We expect little evolution in the galaxy population between z = 0.05 and the present, and so this displacement should not significantly affect the comparison we wish to make. Observational evidence also indicates that galaxy populations in the real universe exhibit little evolution in this redshift interval (e.g. Rudnick et al. 2003; Blanton et al. 2003).
Images of each galaxy were generated for four orientations that model observation from the separate vertices of a tetrahedron with the subhalo at the center (the tetrahedron is oriented with respect to the simulation and so randomly relative to the galaxy). Backgrounds are randomly selected from real SDSS images. The 'target' galaxy is assumed to be in the foreground and in rare cases may be superimposed over systems that are actually closer than the projected distance of the simulated galaxy (z = 0.05). Four separate backgrounds for each galaxy were used to mitigate this and other systematic effects. The final sample that is potentially available for classification therefore comprises a total of 16 images per subhalo, making a total of 110,256 distinct subjects.
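The viewing geometry and the subject count quoted above can be checked with a short sketch. The coordinates below are an assumption: the text only states that the cameras sit at the vertices of a tetrahedron centered on the subhalo, and alternate corners of a cube are one convenient way to realize a regular tetrahedron. The names `CAMERA_DIRECTIONS`, `angle_between` and `count_subjects` are illustrative and do not come from the Illustris imaging pipeline.

```python
import itertools
import math

# Unit vectors from the centre of a regular tetrahedron to its four vertices.
# Assumption: alternate corners of a cube, normalised to unit length.
CAMERA_DIRECTIONS = [
    tuple(c / math.sqrt(3.0) for c in corner)
    for corner in [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
]


def angle_between(u, v):
    """Angle in degrees between two unit vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))


def count_subjects(n_subhaloes, n_views=4, n_backgrounds=4):
    """Distinct images when every view is paired with every background."""
    return n_subhaloes * n_views * n_backgrounds


# Every pair of viewing directions is separated by the same angle,
# arccos(-1/3), so the four views sample orientations evenly.
pair_angles = [
    angle_between(u, v)
    for u, v in itertools.combinations(CAMERA_DIRECTIONS, 2)
]

# 6891 subhaloes x 4 views x 4 backgrounds.
total_subjects = count_subjects(6891)
```

Because the tetrahedron is fixed relative to the simulation box rather than to each galaxy, the four views are effectively random with respect to the disk plane, which is what the text relies on.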

2.2. The SDSS Sample
To provide a valid comparison for the Illustris sample described in §2.1, we begin by selecting SDSS galaxies with $M_\star > 10^{10}\,M_\odot$ and with redshifts between z = 0.045 and z = 0.055.
The left-hand panel of Figure 1 shows the stellar mass² distributions of the raw, redshift-selected SDSS and Illustris datasets. The distributions are obviously mismatched due to a combination of the a priori galaxy mass selection applied to the Illustris sample and incomplete sampling of faint, low-mass galaxies in the SDSS.
Within the narrow redshift range spanned by our SDSS sample, the inferred stellar mass provides a good proxy for galactic size and luminosity, which are both likely to influence the observability of morphological features. We therefore use bootstrap resampling to construct a final SDSS sample with a mass distribution that matches the Illustris sample that was ultimately classified (see §3). The SDSS sample is drawn from 100 bins, equally separated in log-mass space. The right-hand panel of Figure 1 illustrates the resulting distribution in $M_\star$ of our bootstrap-resampled SDSS dataset. This dataset contains 7159 entries, of which 5556 are unique. Among those remaining images that are sampled repeatedly, the vast majority are pairs; very few images appear more than twice.
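One way to realize this kind of mass matching is sketched below. It is a minimal illustration under stated assumptions, not the procedure actually used for the paper: the function name `mass_matched_resample`, the choice to span the bins between the minimum and maximum target mass, and the decision to skip bins that contain no pool galaxy are all hypothetical. Only the 100 equal-width bins in log-mass and the sampling with replacement are taken from the text.

```python
import random


def mass_matched_resample(target_logm, pool_logm, n_bins=100, seed=0):
    """Draw pool indices, with replacement, so that the log-mass histogram
    of the draw matches that of the target sample bin by bin."""
    rng = random.Random(seed)
    lo, hi = min(target_logm), max(target_logm)
    # Fall back to a unit width if all target masses are identical.
    width = (hi - lo) / n_bins or 1.0

    def bin_index(logm):
        # Clamp so that logm == hi lands in the last bin.
        return min(int((logm - lo) / width), n_bins - 1)

    # Group pool members by bin, ignoring masses outside the target range.
    pool_by_bin = {}
    for i, logm in enumerate(pool_logm):
        if lo <= logm <= hi:
            pool_by_bin.setdefault(bin_index(logm), []).append(i)

    # Count how many target galaxies fall in each bin.
    target_counts = {}
    for logm in target_logm:
        b = bin_index(logm)
        target_counts[b] = target_counts.get(b, 0) + 1

    drawn = []
    for b, count in sorted(target_counts.items()):
        candidates = pool_by_bin.get(b)
        if not candidates:
            continue  # assumption: bins with no pool galaxy are skipped
        drawn.extend(rng.choices(candidates, k=count))
    return drawn


# Toy data standing in for the Illustris (target) and SDSS (pool) masses.
toy_rng = random.Random(1)
illustris_logm = [toy_rng.uniform(10.0, 12.0) for _ in range(500)]
sdss_logm = [toy_rng.uniform(9.5, 12.5) for _ in range(5000)]

picked = mass_matched_resample(illustris_logm, sdss_logm)
n_unique = len(set(picked))  # repeats arise because sampling is with replacement
```

Sampling with replacement is what allows the resampled set to contain duplicate entries, as with the 7159 entries and 5556 unique galaxies reported above.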
For reference, Figure 2 compares mass-matched but otherwise randomly selected images from the Illustris and SDSS subject sets.

2.3. Predictable differences between the Illustris and SDSS images
Several assumptions and simplifications were adopted when generating synthetic galaxy images based on the Illustris simulation data. Accordingly, some predictable differences between simulated and real images are inevitable, and we outline the most significant of these here. Intrinsic dust reddening was not considered when generating synthetic images based upon the simulated Illustris galaxy structures. Dust formation occurs in dense molecular clouds, which are not fully resolved at the $\sim 1$ kpc spatial resolution that Illustris achieves, so modeling of the dust within simulated galaxies requires augmentation of the simulation output with a number of ad hoc assumptions. In contrast, the three-dimensional positions of the Illustris galaxies' stellar populations are directly resolved by the simulation. Accordingly, synthetic images that omit dust modeling provide a faithful representation of the raw simulation output, which ultimately simplifies inference of the performance of Illustris using visual classification data. Nonetheless, dust obscuration is known to be significant for some local galaxies (e.g. Masters et al. 2010b), and this omission is manifested in Figure 3 as clear mismatches between the distributions of absolute magnitude in the five SDSS filters (u, g, r, i, z) for the Illustris and resampled SDSS samples, which worsen for increasingly blue filters.
² Stellar masses for the SDSS galaxies were derived from the P97P5 column of the MPA-JHU catalog (Brinchmann et al. 2004).
In addition, Snyder et al. (2015) note that the sizes of simulated and real galaxies (measured by the half-mass radius for Illustris and Petrosian 50% radius for SDSS) are comparable at masses of $10^{11}\,M_\odot$ and above, but at lower $M_\star$ the Illustris galaxies are comparatively more extended. The discrepancy amounts to a factor of two at a mass of $10^{10}\,M_\odot$.

GALAXY ZOO CLASSIFICATION INFRASTRUCTURE
Galaxy Zoo is a set of citizen science projects that have collectively engaged hundreds of thousands of volunteers in the classification of galaxy images drawn from large ground-based surveys and from those conducted by the Hubble Space Telescope (Lintott et al. 2008; Fortson et al. 2012). Such classifications have been shown to be a good match to expert classifications (Lintott et al. 2008; Willett et al. 2013; Simmons et al. 2017; Willett et al. 2017). Moreover, the degree of consistency between the classifications provided by multiple volunteers for the same galaxy image provides a measure of the precision of their aggregate classification.
Classification of a galaxy image in Galaxy Zoo entails answering a series of questions, each evaluating a particular aspect of a galaxy's morphological appearance. The earliest questions segregate the subject set into broad morphological categories before subsequent questions investigate increasingly intricate aspects of a galaxy's appearance. The full question set is subjected to hierarchical filtering such that questions are only asked if they remain pertinent following earlier responses. Accordingly, sampling becomes increasingly sparse for questions that appear later in the classification hierarchy and the degree of statistical uncertainty associated with each subject's consensus response increases. For this project, Illustris images were classified via a decision tree emulating the tree used for the Galaxy Zoo 2 project and described in Willett et al. (2013). In Galaxy Zoo, each galaxy image is classified by at least forty⁴ nominally independent volunteers. The individual responses to each question are then aggregated to yield an overall consensus classification. For questions that require a binary response, the availability of multiple independent responses permits the aggregate classification to be encapsulated as a real-valued vote fraction, which is evaluated as the ratio of the number of positive (or negative) responses to the total number of responses.
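The vote fraction defined above amounts to a simple ratio, sketched here for one question on one subject. The answer labels and the helper name `vote_fractions` are illustrative rather than taken from the Galaxy Zoo data products, and the optional `discard` argument is an assumption about how excluded answers might be handled in practice.

```python
from collections import Counter


def vote_fractions(responses, discard=()):
    """Fraction of retained votes received by each answer.

    responses: iterable of answer labels, one per volunteer.
    discard:   answer labels to drop before normalising.
    """
    kept = [r for r in responses if r not in discard]
    if not kept:
        return {}
    counts = Counter(kept)
    return {answer: n / len(kept) for answer, n in counts.items()}


# Toy subject: 40 volunteers, two of whom flagged the image as an artifact.
votes = ["features"] * 24 + ["smooth"] * 14 + ["artifact"] * 2

all_votes = vote_fractions(votes)                        # 24/40 for "features"
no_artifacts = vote_fractions(votes, discard=("artifact",))  # 24/38 for "features"
```

With roughly forty votes per subject, a single vote changes a fraction by about 0.025, which gives a sense of the granularity of the distributions discussed later.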
The Illustris classifications used for this study were accumulated via the Galaxy Zoo web-based interface between 2015 September and 2017 August. During this interval, 164,627 volunteers contributed 814,283 morphological assessments for 20,248 distinct galaxy images. Classification began with an initial subject set comprising 17,046 images for simulated galaxies with stellar masses $10 \le \log_{10}(M_\star/M_\odot) \le 13$. The initial sample was designed to facilitate the assessment of potential systematic biases that were anticipated but were not ultimately evident during analysis. To isolate the effect of background and viewing angle on morphological classification, a subset of 10,832 images were derived from 677 distinct subhaloes that were selected by uniform random sampling from within two narrow ranges of total halo mass, $10.5 \le \log_{10}(M_{\rm halo}/M_\odot) \le 11$ and $12.5 \le \log_{10}(M_{\rm halo}/M_\odot) \le 13$. Each subhalo was imaged from the four directions corresponding with the vertices of a regular tetrahedron and superimposed over four randomly selected background images per vertex, as described in §2.1. The remaining 6214 images sample the complementary ranges of halo mass, facilitating mass-independent morphological comparison with observed SDSS galaxies. Each synthetic image in this subset corresponds to a distinct subhalo, viewed from a single, randomly selected viewing angle and superimposed over a single randomly selected background. To enhance the sample of classifications for the most massive Illustris galaxies, the initial set was subsequently augmented with 3202 additional images for which the corresponding stellar masses exceeded $10^{10.5}\,M_\odot$.
⁴ The mean number of classifications per subject is 40.2.
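The subject counts in this paragraph are mutually consistent, which can be confirmed with a few lines of arithmetic. The variable names are illustrative; every number is taken directly from the text.

```python
# Subset used to isolate background and viewing-angle effects.
subhaloes_multi_view = 677
views_per_subhalo = 4
backgrounds_per_view = 4
multi_view_images = (
    subhaloes_multi_view * views_per_subhalo * backgrounds_per_view
)

# One image per subhalo for the complementary halo-mass ranges.
single_view_images = 6214

# Initial subject set, then the later high-mass augmentation.
initial_set = multi_view_images + single_view_images
augmentation = 3202
total_images = initial_set + augmentation

# Mean number of classifications per subject.
total_classifications = 814_283
mean_classifications = total_classifications / total_images
```

The result reproduces the 10,832, 17,046 and 20,248 image totals, and the mean rounds to the 40.2 classifications per subject given in the footnote.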
For our SDSS sample, we use data from Galaxy Zoo 2 (Willett et al. 2013), which provides detailed morphological classifications of nearly 250,000 galaxies drawn from the 7th SDSS data release (Abazajian et al. 2009). The subset of the SDSS used for Galaxy Zoo 2 is described by Willett et al. (2013) and was further subsampled to provide a comparison dataset for the Illustris images and their corresponding morphologies.

RESULTS
We identify discrepancies between the Galaxy Zoo classifications that were obtained for the Illustris dataset and those obtained for a redshift- and mass-matched sample of SDSS galaxies by comparing the distributions of vote fractions obtained for each sample. For this investigation, we concentrate on the first, most fundamental question in the Galaxy Zoo 2 decision tree, which distinguishes galaxies with features (predominantly disk-dominated systems) from those where no such features are apparent. Even this crude distinction reflects significant differences in the underlying dynamical and star formation history of a galaxy, which dictate its visual morphology. Accordingly, it is an excellent test of the realism of the images produced by the Illustris simulation. Figure 4 illustrates the unweighted vote fraction distributions for the response "disk or features" to the question "Is the galaxy simply smooth and rounded, with no sign of a disk?" (hereafter $f_{1\to F}$) for the Illustris and SDSS samples⁶.
Consequently, a high value of $f_{1\to F}$ implies that the imaged galaxy probably has features, while $f_{1\to F} \to 0$ implies the converse. A surprisingly marked disparity is evident. The SDSS galaxies show a broadly bimodal distribution, with many (visibly featureless) systems clustered around low featured vote fractions, and a smaller number of systems that have high vote fractions. The SDSS distribution arises primarily from genuine morphological separation between elliptical and spiral systems but is augmented at low $f^{\rm SDSS}_{1\to F}$ by galaxies that would exhibit features but are too faint for any intrinsic substructure to be visible in subject images. The Illustris sample, by contrast, is characterized by a prevalence of galaxies with visible substructure, which is evident in Figure 4 as a dominant peak at a modal vote fraction of around 0.6. It is clear from even this simple comparison that there are significant differences between the two samples.
⁶ Votes for "artifact" were discarded when computing the vote fractions we present in this paper. We verified that omitting artifact votes from our analysis does not qualitatively affect our results.
In Figure 5 we subdivide the Illustris and SDSS samples into disjoint subsamples according to galaxy stellar mass, $M_\star$. For $10 \le \log(M_\star/M_\odot) \le 10.5$, the mismatch between the distributions of $f_{1\to F}$ that was evident for the full range of galaxy masses is qualitatively reproduced. For subsamples that correspond to higher stellar masses, the $f_{1\to F}$ distributions become increasingly similar, and for $M_\star \gtrsim 10^{11}\,M_\odot$, we see a significant fraction of galaxies in the Illustris sample with low vote fractions, as expected from SDSS observations. We verified that the observed overabundance of featured galaxies in Illustris is not an artifact of viewing angle by individually analyzing four subsets of images corresponding to the distinct vertices of the tetrahedral imaging structure described in §2.1 and verifying that qualitatively similar vote fraction distributions are obtained. We also verified that the observed dependence on $M_\star$ is preserved for each subset of the data.
The other notable difference between the two samples is manifested for $M_\star \ge 10^{10.5}\,M_\odot$ as a significant subset of SDSS galaxies with very high featured vote fractions ($f_{1\to F} \gtrsim 0.85$). A population of galaxies that almost all classifiers identify as spiral in the SDSS is either missing in the simulated universe or classified differently in the Illustris sample. Figure 6 shows representative samples of galaxy images drawn from the mismatching region of the $(M_\star, f_{1\to F})$ parameter space for the Illustris (left-hand columns) and SDSS (right-hand columns) datasets. While Illustris does produce a population of featured galaxies with $M_\star \ge 10^{10.5}\,M_\odot$, the SDSS image sample appears to include a larger fraction of nearby grand design spirals that the majority of volunteers would classify as obviously featured. In contrast, the Illustris galaxy images appear slightly more ambiguous, with less prominent disks, and it seems plausible that the apparent deficiency of galaxies that are unanimously perceived as featured reflects this ambiguity.
The intentional omission of dust modeling when generating the synthetic Illustris images (see §2) may also influence the classifications. To illustrate how intrinsic dust extinction affects the classifications that are gathered for real galaxy images, Figure 7 plots featured vote fraction distributions for disjoint subsets of the SDSS sample that were segregated based upon the observed axial ratio $(B/A)_{\rm SDSS}$ between the projected semi-minor ($B$) and semi-major ($A$) axes of each galaxy. Remarkable differences between the four distributions are evident, with volunteers labeling many more galaxies as featured as the typical axial ratio for each subset increases from zero (edge-on) to unity (face-on). This phenomenon is likely dual in origin. Intrinsic dust extinction within the target galaxy may obscure discernible features, while substructures superimposed along the line of sight may appear as a single luminous mass. Focusing on structurally disk-like galaxies, small values of $(B/A)_{\rm SDSS}$ suggest that the target was observed with an edge-on orientation. This configuration increases the probability of discrete substructures occupying nearby sightlines and becoming visually indistinguishable. Moreover, escaping starlight that would reveal such features must traverse a much larger column of dust on average without being absorbed in order to reach the observer. Conversely, as $(B/A)_{\rm SDSS} \to 1$, galaxies with face-on orientations predominate and discrete substructures become more visible.
Figure 5. The $f_{1\to F}$ vote fractions in intervals of $\log(M_\star/M_\odot)$. Proper interpretation of $f_{1\to F}$ is explained in the main text as well as in the caption of Figure 4. Below $\log(M_\star/M_\odot) \sim 11$, the SDSS and Illustris $f_{1\to F}$ distributions match very poorly. At higher masses, overall agreement between the distributions is substantially improved, albeit with a residual discrepancy between the numbers of obviously featured galaxies.
The procedure used to generate the Illustris subject images did not model dust extinction, and we show the normalized featured vote fraction distribution for the full Illustris sample in all four panels of Figure 7. The Illustris and SDSS distributions do not coincide well for any of the $(B/A)_{\rm SDSS}$ ranges considered. For $(B/A)_{\rm SDSS} \gtrsim 0.25$, the disparity is clearly manifested as an excess of apparently featured galaxies among the Illustris sample. It is plausible that the galaxies contributing to this excess would shift to lower $f_{1\to F}$ if dust attenuation were properly simulated when preparing the Illustris subject images. Such migration might dilute or even eliminate the apparent morphological disparities between the two samples.

SUMMARY AND CONCLUSIONS
We have used visual classifications from Galaxy Zoo to compare the coarse morphological appearance of simulated galaxies from the Illustris cosmological simulation with those of a population drawn from the Sloan Digital Sky Survey, matched in mass and redshift. This set of visual classifications allows a direct comparison to be made with observations, with any differences indicating potentially missing physics in the simulation, the inevitably limited resolution of such simulations, or the choices made in producing 'observationally realistic' images. In any case, understanding how selection by morphology might influence comparisons between simulation and observation is essential.
Figure 4 reveals two marked disparities between the two samples. The fraction ($f^{\rm Illus}_{1\to F}$) of classifiers who report noticeable features in Illustris galaxy images exceeds the equivalent quantity ($f^{\rm SDSS}_{1\to F}$) for classifications of SDSS subjects. Indeed, Figure 5 illustrates that for $\log(M_\star/M_\odot) < 10.5$ the distributions of $f^{\rm Illus}_{1\to F}$ and $f^{\rm SDSS}_{1\to F}$ are almost mirror images of each other. While volunteer classifiers clearly discern features in a large majority of the Illustris sample, a far smaller proportion report them for the SDSS galaxy images. There is also a small set of galaxies with high featured vote fractions in SDSS, but this population is absent in Illustris. While the Illustris images are simulated to an observational resolution of $1''$ compared to an achieved average seeing of $1.4''$ for the SDSS, this small difference is unlikely to be responsible for such a large observed difference.
The absence of moderate and high-mass, unambiguously featured galaxies in the Illustris sample that was noted in §4 is perhaps the most surprising result. It may represent the response of volunteer classifiers to simulated objects, which, despite the care taken in preparing the images, are often easily distinguished from their SDSS counterparts. Features such as bright knots, over-prominent arms, and so on are seen in many Illustris images. These artifacts are the result of insufficient particle resolution and may confuse classifiers, reducing the consensus on features. Alternatively, it may be that the simulation is failing to produce realistic grand design spirals.
We also see a failure to produce the correct fraction of smooth galaxies. The importance of this mismatch between the Illustris and SDSS samples appears to depend strongly on the stellar mass range of the galaxies under consideration. Figure 5 plots analogues of Figure 4 for mass-selected subsets of the Illustris and SDSS samples. It is apparent that the distributions of $f^{\rm Illus}_{1\to F}$ and $f^{\rm SDSS}_{1\to F}$ become markedly less disparate for stellar masses $M_\star > 10^{11}\,M_\odot$. However, correspondence between the two datasets remains imperfect, and a population of highly featured galaxies that is present in the real universe but absent in Illustris becomes apparent for $M_\star > 10^{10.5}\,M_\odot$.
The underproduction of unambiguously featured galaxies with large $M_\star$ that we identify in Illustris may indicate that accumulation of stellar mass involves simulated processes that also disrupt or destroy spatially discrete substructures. The most massive galaxies in Illustris are predominantly formed by the hierarchical assembly of smaller systems (Rodriguez-Gomez et al. 2016). Repeated interactions between simulated galaxies provide a plausible mechanism for suppression of visible features. To investigate this possibility, we searched for indications that the time since the most recent major merging event in a simulated galaxy's history predicts its morphological classification for galaxies with $M_\star > 10^{11}\,M_\odot$. No compelling correlations were observed. The two-sample Kolmogorov-Smirnov test yields a p-value of 0.104 when comparing the distributions of the time since the most recent major merging event for subsamples of visually smooth ($f^{\rm Illus}_{1\to F} < 0.3$) and featured ($f^{\rm Illus}_{1\to F} > 0.85$) galaxies. This is consistent with both subsamples being drawn from the same parent distribution. We also checked for a significant correlation between the fraction of galactic stellar mass that was formed in situ and the visibility of features in the Illustris galaxy images. In this case, the two-sample Kolmogorov-Smirnov test yields a p-value of $4.1 \times 10^{-7}$ when comparing the samples of smooth and featured galaxies. This result indicates that galaxies with $f^{\rm Illus}_{1\to F} \ge 0.85$ contain a larger proportion of stars that were formed in situ, which is broadly supportive of the hypothesis that visually featured galaxies experienced comparatively fewer interactions during their formation. A more rigorous verification that accumulation of ex situ stellar mass is indeed responsible for the disruption of visually apparent substructures would require detailed examination of each galaxy's assembly history, which is beyond the scope of this paper.
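For readers unfamiliar with the test, a self-contained sketch of a two-sample Kolmogorov-Smirnov comparison is given below. It is not the code used for the analysis: the p-value here comes from the asymptotic Kolmogorov series with a commonly used small-sample correction, which may differ slightly from whatever implementation produced the quoted values, and in practice a library routine such as `scipy.stats.ks_2samp` would normally be preferred. The function name `ks_two_sample` and the toy data are hypothetical; only the selection thresholds of 0.3 and 0.85 are taken from the text.

```python
import math


def ks_two_sample(sample_a, sample_b):
    """Return (D, p) for the two-sample Kolmogorov-Smirnov test.

    D is the largest absolute difference between the two empirical
    cumulative distributions. p is an asymptotic approximation.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    n1, n2 = len(a), len(b)
    i = j = 0
    d = 0.0
    # Walk through the pooled values, advancing both empirical CDFs.
    while i < n1 and j < n2:
        x = min(a[i], b[j])
        while i < n1 and a[i] <= x:
            i += 1
        while j < n2 and b[j] <= x:
            j += 1
        d = max(d, abs(i / n1 - j / n2))

    # Effective sample size and small-sample corrected argument.
    n_eff = n1 * n2 / (n1 + n2)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.2:
        # The alternating series is unstable here; p is effectively 1.
        return d, 1.0
    series = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, 101)
    )
    return d, min(1.0, max(0.0, 2.0 * series))


# Hypothetical (vote fraction, Gyr since last major merger) pairs.
toy = [(0.10, 2.1), (0.22, 3.4), (0.28, 1.0),
       (0.90, 2.8), (0.95, 4.0), (0.88, 1.7), (0.50, 3.0)]
smooth = [t for f, t in toy if f < 0.3]
featured = [t for f, t in toy if f > 0.85]
d_stat, p_value = ks_two_sample(smooth, featured)
```

A large p-value, as for the merger times in the text, means the test cannot distinguish the two subsamples; a very small one, as for the in situ mass fractions, indicates that they are unlikely to share a parent distribution.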
Given that the ability of a simulation to represent a galaxy depends coarsely on the number of particles used to model it, some mass dependence should be expected; indeed, this is why galaxies with stellar masses less than $10^{10}\,M_\odot$ were excluded from the study. Such differences have been seen before, in particular by Bottrell et al. (2017a), who showed that a threshold at $10^{11}\,M_\odot$ also emerges when attempting morphological classification using parametric fits to the galaxy's light profile. Below this critical mass, the simulation produces a large proportion of disk-dominated galaxies; we confirm this result and show that it has a significant effect not only on the parametric measurements but on the overall visual morphology of the system being studied. In some cases, non-parametric morphological metrics for Illustris galaxies also appear to differ from those of their physical counterparts at stellar masses of order $10^{11}\,M_\odot$. For example, Bignone et al. (2017) show that the measured asymmetry of merging Illustris galaxies appears artificially large in comparison with mass-matched observational samples. In the same mass range, Snyder et al. (2015) identify a peculiar population of galaxies that exhibit distinctive ring-like structures of enhanced star formation, resulting in unexpectedly extended morphologies (examples of several such systems are included in Figure 2). Snyder et al. (2015) suggest that these ring-like structures may reflect an imperfect model for coupling between feedback mechanisms and the interstellar medium (ISM) in Illustris galaxies. Alternatively, the rings of star formation may be an inherent manifestation of the ISM equation of state that is assumed for the Illustris simulation. Earlier studies (e.g. Hambleton et al. 2011) compared the properties of simulated galaxy samples with those of locally observed systems using non-parametric morphological estimators. Similar discrepancies pertaining to excessive asymmetry and clumpy substructure were identified.
Because a galaxy's stellar mass in Illustris broadly maps to the number of stellar particles comprising the simulated galaxy, we conclude that below $10^{11}\,M_\odot$ the number of stellar particles comprising a galaxy is apparently insufficient to represent the simulated physics reliably, and observed structures are often likely to result from resolution-induced artifacts. The effects are subtle, and the images produced by the simulation are clearly perceived as realistic, but as a population there remain differences between simulated and observed galaxies. These differences complicate more detailed comparisons between the Illustris and SDSS galaxy morphologies. Below $M_\star \sim 10^{11}\,M_\odot$, the coarse morphological differences between observed and simulated galaxies could artificially distort the later stages of classification, because early volunteer responses restrict the set of questions that are subsequently posed. For the most massive galaxies, a limited number of subject images results in excessively sparse sampling of the Galaxy Zoo classification hierarchy that prevents reliable inference of morphological characteristics. Future studies that match SDSS and Illustris samples should be aware of the $10^{11}\,M_\odot$ threshold we have identified and its effects on the comparison being made. We have also shown that insight can be derived from visual analysis of large samples of images derived from simulations and recommend this procedure for future data products.

Figure 1. Raw (left) and resampled (right) stellar mass distributions for the Illustris (blue hollow) and SDSS (green filled) datasets. Distributions are shown for the inferred stellar mass within 97.5% (left) of the galaxy's Petrosian radius.

Figure 3. Illustration of the mismatch between the distributions of absolute magnitude for the 5 SDSS filters (u, g, r, i, z) for the Illustris and resampled SDSS datasets.
Figure 4. Comparison between the normalized distributions for $f^{\rm Illus}_{1\to F}$ and $f^{\rm SDSS}_{1\to F}$ corresponding to the full Illustris and SDSS samples, respectively. A high value of $f_{1\to F}$ implies that the majority of volunteers discerned discrete substructure in the galaxy image, while $f_{1\to F} \to 0$ implies the converse. While the SDSS distribution is dominated by systems with low $f^{\rm SDSS}_{1\to F}$, the Illustris sample apparently contains many more galaxies that exhibit visible substructure and yield more intermediate vote fractions.

Figure 7. Distributions of $f^{\rm SDSS}_{1\to F}$ for disjoint subsets of the SDSS sample (green) that were segregated based upon the observed axial ratio $(B/A)_{\rm SDSS}$ between the projected semi-minor ($B$) and semi-major ($A$) axes of each galaxy. Proper interpretation of $f_{1\to F}$ is explained in the main text as well as in the caption of Figure 4. Many more galaxies exhibit visible features (attain high $f^{\rm SDSS}_{1\to F}$) as $(B/A)_{\rm SDSS}$ increases from zero to unity. For comparison, the distribution of $f^{\rm Illus}_{1\to F}$ (blue line) for the entire Illustris sample is shown in all panels. For the SDSS distribution shown in the upper-left panel (smallest $(B/A)_{\rm SDSS}$, most edge-on), no galaxies were unambiguously classified as featured or smooth. This indicates that edge-on galaxies may be particularly difficult to separate based on visual inspection.