walter: A Tool for Predicting Resolved Stellar Population Observations with Applications to the Roman Space Telescope

Studies of resolved stellar populations in the Milky Way and nearby galaxies reveal an amazingly detailed and clear picture of galaxy evolution. Within the Local Group, the ability to probe the stellar populations of small and large galaxies opens up the possibility of exploring key questions such as the nature of dark matter, the detailed formation history of different galaxy components, and the role of accretion in galactic formation. Upcoming wide-field surveys promise to extend this ability to all galaxies within 10 Mpc, drastically increasing our capability to decipher galaxy evolution and enabling statistical studies of galaxies' stellar populations. To facilitate the optimal use of these upcoming capabilities we develop a simple formalism to predict the density of resolved stars for an observation of a stellar population at fixed surface brightness and population parameters. We provide an interface to calculate all quantities of interest to this formalism via a public release of the code: walter. This code enables calculation of (i) the expected number density of detected stars, (ii) the exposure time needed to reach certain population features, such as the horizontal branch, and (iii) an estimate of the crowding limit, among other features. These calculations will be very useful for planning surveys with NASA's upcoming Nancy Grace Roman Space Telescope (Roman, formerly WFIRST), which we use for example calculations throughout this work.


INTRODUCTION
The Nancy Grace Roman Space Telescope (Roman) has the potential to revolutionize the study of stellar populations in nearby galaxies. Roman is a NASA mission currently under construction and scheduled for launch in 2025. It is a 2.4m telescope designed to cover and Section 1.2 before describing the purposes of this work in Section 1.3.

Recent Resolved Stellar Populations Studies
Resolved stellar photometry provides a highly sensitive probe for several fundamental astrophysical processes (e.g., Dalcanton et al. 2012). This includes the study of field dwarf galaxies, dwarf satellite galaxies, galaxy stellar halos, and disk formation. In particular, these examples can shed light on the formation of galaxies and the distribution of dark matter. The ages of the stellar populations of field dwarfs probe the earliest epochs of galaxy formation (e.g., Weisz et al. 2015; Fillingham et al. 2018; Sacchi et al. 2021) as well as the epoch of reionization (e.g., Simon et al. 2021).
The dwarf satellite mass function's sensitivity to the epoch of reionization makes these low-mass galaxies significant for cosmological constraints (e.g., Bullock et al. 2000; Graus et al. 2016). The constraining power of galaxy stellar halos has been demonstrated by Bullock & Johnston (2005), who performed a suite of simulations showing the wide array of structures expected for a variety of galaxy formation histories. Low surface brightness stellar halos provide the strongest known constraints on the accretion histories of galaxies (Bullock & Johnston 2005; Cooper et al. 2010; Pillepich et al. 2014; D'Souza & Bell 2018) and are unique probes of the structure and substructure of the dark matter halos that surround them (Johnston et al. 1999; Carlberg 2009).
Observational work has already begun to address some of the cosmic mysteries outlined above. In the Local Group, large new imaging surveys have uncovered dozens of new satellite galaxies and streams around the Milky Way and M31 (e.g., Martin et al. 2013; The DES Collaboration et al. 2015; McConnachie et al. 2018; Malhan et al. 2018, and references therein). This work revealed abundant substructure in halos and discovered dozens of very low mass dwarf galaxies, providing dramatic confirmation of the widely accepted hierarchical galaxy formation paradigm (at least in part; Bell et al. 2008). This sparked a vigorous theoretical exploration of how galaxies' assemblies and mass distributions can be accurately measured from the remnants of disrupted satellites (Price-Whelan et al. 2014; Sanderson et al. 2015; Belokurov et al. 2018; Helmi et al. 2018; Lancaster et al. 2019; Reino et al. 2020). The Panchromatic Hubble Andromeda Treasury (PHAT; Dalcanton et al. 2012) survey observations have provided detailed insight into the formation and evolution of stars and disk galaxies (e.g., Rosenfield et al. 2012; Williams et al. 2012; Lewis et al. 2015; Weisz et al. 2015; Johnson et al. 2016; Williams et al. 2017) by resolving more than 100 million stars in the disk of M31 (Williams et al. 2014). Outside the Local Group, integrated light and star-counting observations have allowed coarse mapping of stellar halos as well as the discovery of significant substructures and satellites (e.g., Martínez-Delgado et al. 2010; van Dokkum et al. 2014; Sand et al. 2014; Crnojević et al. 2016; Carlin et al. 2016a; Mao et al. 2021). However, due to their faintness and the limited FoV of previous instruments, the resolved populations of stellar halos and low-mass dwarf galaxies have so far only been mapped in detail in a small handful of galaxies (e.g., Newberg et al. 2002; Majewski et al. 2003; Calchi Novati et al. 2005; Ibata et al. 2007; McConnachie et al. 2009; Crnojević et al. 2016; Bennet et al. 2019; Smercina et al. 2020).

Potential Future Impact
These insights have tremendous potential, but with relevant observations available for only a handful of galaxies in a narrow range of environments, definitive empirical conclusions on galaxy assembly writ large and the distribution of dark matter within galaxies remain out of reach. Resolved stellar photometry from Roman will have the potential to discover faint dwarf galaxy satellites in stellar halos (see e.g., Crnojević et al. 2019), adding significantly to the number of galaxies with well-measured satellite mass functions that can be directly compared to simulated galaxy samples (e.g., Garrison-Kimmel et al. 2014; Kelley et al. 2019; Buck et al. 2019; Jiang et al. 2021; Font et al. 2021). Roman will allow comparisons to be made in new environments for fainter satellite systems beyond the Local Group. Moreover, the dark matter halos of dwarf galaxies are predicted to be populated by a large number of dark matter clumps (Gao et al. 2004). Some of these dwarfs may host their own faint satellite galaxies (e.g., Carlin et al. 2016b, 2020) and streams (see e.g., Martínez-Delgado et al. 2012; Starkenburg et al., in prep.), making the surroundings of any galaxy (regardless of its mass) a fertile ground for the discovery of new faint structures (Sales et al. 2013; Wheeler et al. 2015; Pardy et al. 2020). The statistics of such satellite counts, from dwarf to massive galaxies, can be increased by orders of magnitude with Roman.
Beyond intact satellite galaxies, the physical debris from dwarf galaxies merging with larger halos looks dramatically different if the dwarfs were accreted early/late, on orbits of low/high eccentricity, and if they were of high or low luminosity (e.g., Hendel & Johnston 2015). The observed properties of substructure can be associated with fundamental physical quantities: the frequency of tidal debris reflects the recent accretion rate; the physical scales and surface brightnesses reflect the mass and luminosity functions of infalling objects; and the morphology reflects the orbits. Thus, substructure in halos offers a direct constraint on the history and nature of baryonic and dark matter assembly (Johnston et al. 2008).
In addition to formation processes, tidal debris probes the dark matter properties and distribution around the Milky Way (e.g., Price-Whelan & Bonaca 2018a; Koposov et al. 2010; Küpper et al. 2015) and external galaxies (e.g., Fardal et al. 2013; Pearson et al. 2022b). Complexity arises beyond the galaxy accretion history because the morphology and frequency of debris structures can be affected by the three-dimensional shape of dark matter halos (certain orbit families can cause "fanning" of thin tidal streams; see Pearson et al. 2015; Fardal et al. 2015; Price-Whelan et al. 2016; Yavetz et al. 2021). These "complications" leave observable signatures in the morphology of streams alone and hence offer the additional possibility of measuring the orbit distribution within dark matter halos, and possibly the halos' triaxiality. Additionally, because gaps can form in streams as they interact with dark matter subhalos (e.g., Ibata et al. 2002; Johnston et al. 2002; Yoon et al. 2011; Carlberg 2012), particularly thin streams, such as those emerging from globular clusters, are sensitive to low mass subhalos and can be used to indirectly probe the nature of dark matter (e.g., Bovy et al. 2017; Price-Whelan & Bonaca 2018b; Bonaca et al. 2019). Pearson et al. (2019, 2022a) showed that Roman will be able to detect such thin streams in galaxies out to distances beyond 3.5 Mpc. This provides exciting prospects for future searches for gaps in streams orbiting galaxies that do not host molecular clouds, galactic bars, or spiral arms, which can contaminate the gap signatures from dark matter subhalos (Amorisco et al. 2016; Erkal et al. 2017; Pearson et al. 2017; Banik & Bovy 2019).

This Work
Capitalizing on these capabilities will require the community to carefully plan observations to efficiently attack the scientific goals of interest. With the aim of providing better tools for planning observations, we develop a general formalism for calculating the number of detected stars per unit sky area in a given observation, and at what point that observation will become crowding limited. This formalism simplifies the estimation of observational sensitivity to surface brightness and stellar population parameters. The formalism itself is also entirely general, and applicable to any observation aiming to resolve a population of stars, not necessarily using Roman.
To make this formalism convenient to use, we describe and provide access and installation instructions for walter (named in honor of Walter Baade, the first astronomer to resolve M31 into stars; Baade 1944), a code to calculate quantities necessary for planning stellar populations surveys. These quantities include the number of stars an observation will detect in a pointing, covering objects from nearby dwarf galaxies that fit entirely within a field to very extended and faint stellar halos of large galaxies. The user can make calculations for populations of a wide range of ages and metallicities. The code is available on GitHub, and only depends on numpy, matplotlib, IPython, and scipy (Harris et al. 2020; Hunter 2007; Perez & Granger 2007; Virtanen et al. 2020). A faster version of the code used to compute the quantities laid out in this paper also requires Cython, though this is not necessary for simpler applications (Behnel et al. 2011).
The paper is structured as follows: in Section 2, we describe our mathematical formalism. In Section 3, we provide example calculations of the quantities laid out in Section 2. These calculations are also carried out and described in the accompanying code. In Section 4 we give a comparison of the predictions of our code against observational data. Finally, in Section 5, we provide concrete examples of how our code could be used to help plan observational campaigns and we give a brief conclusion in Section 6.

FORMALISM
In this Section we explain our approach to calculating the number of stars that are expected to be resolved in a given observation. Additionally, we calculate the point at which an observation would become too crowded to accurately measure the magnitude of all stars that we would like to resolve. To simplify the discussion in this section, we restrict ourselves to single values for the following quantities, which will be the key variables determining an observation:
τ - The stellar age of the population.
[Fe/H] - The metallicity of the population.
d - The distance to the population.
t exp - The exposure time of the observation.
B - The photometric band of the observation.
Σ B - The surface brightness of the population in band B.
B 5σ - The 5σ limiting apparent magnitude in band B for (isolated) point source detection at an exposure time of 1 hour. This is a property of the telescope/instrument being used.
This approach allows us to keep the discussion general and therefore apply the same formalism to any number of different types of resolved stellar population observations, from Milky Way globular clusters to stellar halos of distant galaxies. With this general structure in mind, we proceed by first calculating the expected density of resolved stars per unit sky area for a given observation in Section 2.1. We then calculate the regime in which an observation will become crowding limited in Section 2.2. Finally, we discuss some of the caveats of this work in Section 2.3.

Density of Detected Stars
We calculate the number of resolved/detected stars per unit sky area, n det , in a given observation defined by the variables specified above: τ , [Fe/H], d, t exp , B, Σ B , and B 5σ (a parameter of the instrument). We do this by separately calculating (i) the total number-density of stars per unit sky area (detected and undetected), n tot , and (ii) the number fraction of stars that are detected, f det . We can then calculate n det as n det = n tot f det .
These quantities are intrinsically stochastic, because star formation is itself a stochastic process. We parameterize this stochasticity by the initial mass function (IMF) of the population, which we denote by ξ(m). We assume that the IMF is fully sampled and use ξ(m) to calculate expectation values for the quantities of interest. We assume that ξ(m) is normalized so that it integrates to one; this assumption is important for the correctness of our formulae below. The IMF can be thought of as an independent characteristic of the population. In our applications below we assume a Kroupa IMF (Kroupa 2001), but for now we keep the discussion general. We note that throughout this formalism we refer to the mass m of a star, by which we always mean its initial mass, as it is on the Zero-Age Main Sequence (ZAMS).
We first aim to calculate the total number density of stars (detected and undetected) in a given observation, n tot , which depends on all quantities except for the exposure time. First, we translate the IMF into a luminosity function for the population, Φ(L), using an isochrone, for a given age and metallicity, taken from a stellar evolution code. We denote the mapping of an initial mass to its luminosity in a given band, B, for a given age and metallicity, τ and [Fe/H], as L B,τ,[Fe/H] (m). For brevity of notation we simply write this function as L B , where the dependence on the population parameters is implicit.
Most of the difficulty of calculating n det comes from the fact that this mapping (m → L B ) is not one-to-one (stars at different masses can have the same luminosity) and is therefore non-invertible. This complexity means that we cannot write down a simple expression for the luminosity function, even if our expression for the IMF is quite simple.
To calculate n tot we only need Σ B , d, and the average luminosity of the population in band B, ⟨L B⟩, which can be written as:
$$\langle L_B \rangle = \int \xi(m)\, L_B(m)\, dm. \qquad (1)$$
Again, the dependence of this quantity on population parameters is implicit. This quantity is the expectation value of the L B map on the space of initial masses. If we first define the distance modulus of the population µ ≡ 25 + 5 log10 (d/1 Mpc) then we can write n tot as:
$$n_{\rm tot} = \frac{10^{-0.4(\Sigma_B - \mu)}}{\langle L_B \rangle}, \qquad (2)$$
where L B is measured in units of the luminosity of a zero-magnitude source, so that a star of absolute magnitude M B has L B = 10^(−0.4 M B).
Next we wish to calculate the number fraction of the population that we actually observe, f det . The distance to the population and the exposure time of the observation only contribute to the calculation of f det by determining the absolute magnitude at which we can no longer resolve stars, which we write as B cut (for an observation in band B),
$$B_{\rm cut} = B_{5\sigma} - \mu + 1.25 \log_{10}\left(\frac{t_{\rm exp}}{1\,{\rm hr}}\right), \qquad (3)$$
where µ is again the distance modulus and B 5σ is the apparent magnitude limit for a 1 hour exposure, as defined above. This form assumes that the limiting flux scales as the inverse square root of the exposure time. Reference values for B 5σ in each Roman band were obtained from the Roman Wide Field Instrument technical specifications webpage and are listed in Table 1. We then define the set of all luminosities brighter than B cut as L Bcut . With these definitions in mind f det can be calculated as:
$$f_{\rm det} = \int_{L_B^{-1}(\mathcal{L}_{B_{\rm cut}})} \xi(m)\, dm, \qquad (4)$$
where L_B^{-1}(L Bcut ) is the set of all initial stellar masses m that are brighter than B cut at age τ and metallicity [Fe/H]. To be explicit with the dependencies, f det is a function of B cut , τ , and [Fe/H] only. This is also the way that we calculate f det in practice (in terms of the limiting absolute magnitude), which we then translate into dependencies on d and t exp using Equation 3 for our calculations.
Figure 1. The conversion factors between surface brightness in the F 158 band and several other bands, indicated by color, as a function of metallicity of the population being observed. We show the conversion factors at ages of τ = 2.8 and 11.2 Gyr as dashed and solid lines respectively. These conversion factors can be strong functions of metallicity but are relatively independent of age, especially for the redder bands.
Putting Equation 2 and Equation 4 together, we then have the formula for the number of detected stars per unit sky area, n det , as:
$$n_{\rm det} = n_{\rm tot}\, f_{\rm det} = \frac{10^{-0.4(\Sigma_B - \mu)}}{\langle L_B \rangle}\, f_{\rm det}, \qquad (5)$$
(with luminosities in units of the luminosity of a zero-magnitude source), where all of the complexity of stellar populations is now folded into the quantities ⟨L B⟩ and f det . Below, we review how f det and ⟨L B⟩ depend on their various parameters for the Roman bands. First, however, we include the effects of crowding in this formalism.
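The chain from isochrone to n det can be sketched numerically as follows. This is a minimal illustration rather than the walter implementation: the function names, the toy isochrone, the single power-law IMF, and the value B 5σ = 28 are all made up for the example. It assumes luminosities are expressed as 10^(−0.4 M) and that depth grows as 1.25 log10 of the exposure time (a background-limited scaling).

```python
import numpy as np


def trapezoid(y, x):
    """Trapezoid-rule integral of samples y over grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def detected_density(mass, abs_mag, imf, sigma_b, d_mpc, t_exp_s, b_5sigma):
    """Return (n_tot, f_det, n_det) per square arcsecond for one band."""
    xi = imf / trapezoid(imf, mass)            # normalize the IMF to unit integral
    mu = 25.0 + 5.0 * np.log10(d_mpc)          # distance modulus
    # Assumed depth scaling (background-limited): 1.25 log10(t_exp / 1 hr).
    b_cut = b_5sigma - mu + 1.25 * np.log10(t_exp_s / 3600.0)
    lum = 10.0 ** (-0.4 * abs_mag)             # luminosity in zero-magnitude units
    mean_lum = trapezoid(xi * lum, mass)       # <L_B>
    n_tot = 10.0 ** (-0.4 * (sigma_b - mu)) / mean_lum
    # A mask over the mass samples avoids inverting L_B(m), so a
    # non-monotonic isochrone is handled without special cases.
    f_det = trapezoid(xi * (abs_mag < b_cut), mass)
    return n_tot, f_det, n_tot * f_det


# Toy, made-up isochrone: a main sequence plus a short, bright giant branch.
mass = np.linspace(0.1, 0.9, 801)
abs_mag = np.where(mass < 0.8, 12.0 - 10.0 * mass, 4.0 - 80.0 * (mass - 0.8))
imf = mass ** -2.3                             # placeholder single power law

n_tot, f_det, n_det = detected_density(
    mass, abs_mag, imf, sigma_b=30.0, d_mpc=1.0, t_exp_s=3600.0, b_5sigma=28.0
)
```

The hard mask is cruder than interpolating between mass samples at the magnitude cut, so f det from this sketch is only as accurate as the mass grid is fine.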

Crowding Limits
Another important aspect to consider for resolved stellar populations is the crowding limit. This is the surface brightness at which the density of stars on the sky is high enough that it affects the ability to precisely measure the photometry of individual stars. To quantify the effects of the various parameters introduced in the previous section we follow the formalism of Olsen et al. (2003). For a given photometric band B, this work calculates the surface brightness at which stars of magnitude M lim can no longer be measured with photometric precision σ m (measured in magnitudes) as: where Σ m is the apparent surface brightness of the population in magnitudes per square arcsecond, L lim is the luminosity corresponding to M lim , µ is the distance modulus, A res is the angular scale of the resolution element or PSF in square arcseconds, and ⟨L²_B⟩_{L<L lim} is the expectation value of the square of the luminosity over stars less luminous than L lim , defined as:
$$\langle L_B^2 \rangle_{L<L_{\rm lim}} = \int_{L_B^{-1}(\mathcal{L}_{L_{\rm lim}})} \xi(m)\, L_B(m)^2\, dm, \qquad (7)$$
where L L lim denotes the set of luminosities dimmer than L lim and L_B^{-1}(L L lim ) denotes the set of all initial stellar masses m that are dimmer than M lim at age τ and metallicity [Fe/H]. Note that we have changed the formalism of Olsen et al. (2003) to work in terms of the IMF, ξ(m), instead of the luminosity function Φ(L).
Since we hope to resolve stars up to some limiting magnitude, which we described in the last section as B cut , for the photometric band B, it is natural to set M lim = B cut to calculate the crowding limited surface brightness of most interest.
We note that this gives a single surface brightness value at which an observation becomes crowding limited. In reality, a single observation has varying surface brightness over the field of view. This means that some parts of an observation may be crowding limited while others might not.
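A rough way to see how such a crowding limit behaves is a simple Poisson-fluctuation argument, sketched below. This is written in the spirit of Olsen et al. (2003) but is not claimed to reproduce their exact expression: it assumes that the magnitude error on a star of luminosity L lim is 1.086 times the fractional rms fluctuation in the summed light of fainter stars within one resolution element, and that luminosities are expressed as 10^(−0.4 M). The function name, toy isochrone, and numerical values are illustrative only.

```python
import numpy as np


def trapezoid(y, x):
    """Trapezoid-rule integral of samples y over grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def crowding_limited_sb(mass, abs_mag, imf, m_lim, mu, a_res, sigma_m):
    """Surface brightness (mag / arcsec^2) at which stars of absolute
    magnitude m_lim reach a crowding error of sigma_m magnitudes."""
    xi = imf / trapezoid(imf, mass)                  # unit-normalized IMF
    lum = 10.0 ** (-0.4 * abs_mag)
    l_lim = 10.0 ** (-0.4 * m_lim)
    mean_lum = trapezoid(xi * lum, mass)
    # Second moment restricted to stars fainter than the limit.
    mean_l2_faint = trapezoid(xi * lum ** 2 * (abs_mag > m_lim), mass)
    # Poisson assumption: sigma_m = 1.086 sqrt(n A_res <L^2>_faint) / L_lim,
    # solved here for the stellar number density n per square arcsecond.
    n_crowd = (sigma_m / 1.086) ** 2 * l_lim ** 2 / (a_res * mean_l2_faint)
    # Invert n = 10^(-0.4 (Sigma - mu)) / <L> for the surface brightness.
    return mu - 2.5 * np.log10(n_crowd * mean_lum)


# Toy, made-up isochrone and IMF, purely for illustration.
mass = np.linspace(0.1, 0.9, 801)
abs_mag = np.where(mass < 0.8, 12.0 - 10.0 * mass, 4.0 - 80.0 * (mass - 0.8))
imf = mass ** -2.3

sb_fine = crowding_limited_sb(mass, abs_mag, imf, 3.0, 30.0, a_res=0.01, sigma_m=0.1)
sb_coarse = crowding_limited_sb(mass, abs_mag, imf, 3.0, 30.0, a_res=0.04, sigma_m=0.1)
```

Under these assumptions a resolution element four times larger in area becomes crowding limited at a surface brightness 2.5 log10(4) ≈ 1.5 mag fainter.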

Caveats of our Formalism
While the formalism laid out above allows the treatment to remain general, it also means that the formalism does not exactly describe any real observation. For example, we assume that the stellar population being observed has a single age and metallicity. In principle, this can be extended by taking a linear combination of the results from single populations in terms of some metallicity/age distribution. However, given our formalism as stated above, this would require knowing the surface brightness of each single-age, single-metallicity component of the distribution. The simpler quantity to work with here would be stellar mass surface density Σ * , which could then be straightforwardly and self-consistently translated into a surface brightness of each component of that age and metallicity. We leave the implementation of this functionality to later updates of the code.
Figure 2. The evolution of the fraction of the stellar population that is detected in the F 158 band as a function of distance for t exp = 10^3 s (solid lines) and t exp = 10^4 s (dashed lines) exposures with Roman and how this varies with the stellar population parameters (τ in top panels and [Fe/H] in bottom panels). For each panel we also indicate the Main Sequence Turn Off (MSTO), Horizontal Branch (HB), and Tip of the Red Giant Branch (TRGB) according to the point at which they fall below the 5σ point source detection apparent magnitude. These each correspond to noticeable drops in f det . Some main take-aways that are apparent in this figure: (i) as a population ages, it dims and, even at fixed surface brightness, we detect a smaller fraction of the population, (ii) the TRGB can be seen out to 35 Mpc for 10^4 s exposures and populations of interest to halo studies, and (iii) metallicity mainly has an effect on the observability of the RGB, not other parts of the population.
Another side-effect of taking Σ B as our fundamental parameter is that, when performing calculations for multiple bands, it is left up to the user to make sure that the surface brightnesses in the bands are consistent with one another. As a guide for the case of the Roman bands, we provide the necessary conversions between the surface brightness in the F 158 band and any other band as a function of metallicity in Figure 1 for two different ages. We also give numeric values in Table 1 for a population of age τ = 11.2 Gyr and metallicity [Fe/H] = −2. As we show in Figure 1, these conversion factors can depend strongly on the metallicity of the population being observed, but they are less sensitive to the age (at least for not-newly-born populations), especially in the redder bands. The conversion factor between two arbitrary bands B and C is straightforward to calculate from our code as
$$\Sigma_B - \Sigma_C = -2.5 \log_{10}\left(\frac{\langle L_B \rangle}{\langle L_C \rangle}\right),$$
where ⟨L B⟩ is given by Equation 1 for an assumed population age and metallicity, with both luminosities expressed relative to the same magnitude zero point. This should be useful for the further application of this code to convert between surface brightnesses observed from ground based observatories to other bands in which one is hoping to resolve the stellar population.
Figure 3. The evolution of the total number density of stars, both detected and undetected, with distance. The panels correspond to two different Roman bands which are indicated in the bottom right corner of each panel. We pick three assumed surface brightnesses, spanning from values typical of the outer disks of galaxies, where crowding can begin to dominate (24 mag arcsec^-2), to the predicted values for faint streams in galaxy halos (32 mag arcsec^-2, see Figure 6). The number density is most strongly dependent on the surface brightness of the population, which is indicated by numbers at the right hand side of each panel in units of AB magnitudes per square arcsecond. The number density is also dependent on the metallicity of the population being considered (which is indicated by the curve colors) and the age of the population; the solid lines indicate an 11 Gyr old population while the dot-dashed lines indicate a 4 Gyr old population. We emphasize the bluest (R062) and reddest (F 184) bands to show the extreme ends of this evolution. Generally, n tot ∝ d^2 and increases with age as a population dims (since n tot ∝ ⟨L B⟩^-1).
Additionally, our approach is analytic, assuming that the underlying stellar population being observed exactly follows a continuous distribution in mass, ξ(m). This should produce accurate results when the population being observed fully samples the relevant mass range of the IMF. However, what this mass range is depends on the stellar population properties themselves (for example, at old stellar population ages, only initial stellar masses ≲ 2 M⊙ contribute to the luminosity of the population). Generally this will be true if the inferred total number of stars in the population N tot = n tot ∆Ω (where ∆Ω is the area over which the observation is carried out) is large (≳ 10^4). N tot can be directly calculated from our code and used as a guide on this front.
There is an additional source of stochasticity when it comes to only detecting the very brightest members of a stellar population. For most reasonable ξ(m), there are only a few stars at the very brightest magnitudes at any age, representing a very small range in mass and thus a very small fraction of the population. This means that while N tot may be large, f det is very small when only considering the very brightest stars (see Figure 2), so that the total number of detected stars N det = N tot f det is still small. This is especially worrisome since different stellar evolution codes will give slightly different predictions for f det at these very low values, meaning that N det can easily go from 10 to 2 when using different codes. In general, if the predicted N det of walter is ≲ 100, we advise caution on the strict interpretation of the results.
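The band conversion and the sampling checks described above reduce to a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical and not part of walter, the band conversion assumes both mean luminosities are measured against the same magnitude zero point, and the thresholds of 10^4 total stars and 100 detected stars are the rough guides quoted in the text.

```python
import math


def sb_offset(mean_lum_b, mean_lum_c):
    """Sigma_B - Sigma_C (mag / arcsec^2) for one population, given its
    mean stellar luminosity in bands B and C on a common zero point."""
    return -2.5 * math.log10(mean_lum_b / mean_lum_c)


def sampling_check(n_tot, f_det, area_arcsec2, min_total=1.0e4, min_detected=100.0):
    """Return (N_tot, N_det, ok): expected star counts over the observed
    area and whether both exceed the rough guides for a well-sampled IMF."""
    n_total = n_tot * area_arcsec2
    n_detected = n_total * f_det
    ok = n_total >= min_total and n_detected >= min_detected
    return n_total, n_detected, ok


# A population 100 times more luminous per star in band B than in band C
# is 5 mag / arcsec^2 brighter in B.
offset = sb_offset(100.0, 1.0)

# Made-up example: 0.5 stars / arcsec^2 over 10^5 arcsec^2 with f_det = 1e-4.
counts = sampling_check(0.5, 1.0e-4, 1.0e5)
```

In the made-up example N tot is comfortably large but only about five stars are detected, which is the regime where the text advises caution.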
Taking into account the assumption of a constant surface brightness is more subtle. In principle, one should be able to break any real observation into chunks of approximately constant surface brightness, and apply these results individually to each chunk. However, this can be difficult when the surface brightness varies greatly (as in the presence of star clusters) or when the scale on which the surface brightness is constant becomes comparable to the scale on which the IMF is poorly sampled (where our population-averaged formalism would not robustly apply). We do not address this issue here, though it should only be a problem for particularly close-by populations, where the total stellar mass is ≲ 10^3 M⊙ over several pixels.
Figure 4. The evolution of the number density of detected stars with distance for 10^3 s (solid lines) and 10^4 s (dashed lines) exposures using the Padova isochrones. We stop plotting the curves once f det < 10^-6. As in Figure 3, the panels correspond to different Roman bands which are indicated in the top right corner of each panel. The number density is most strongly dependent on the surface brightness of the population, which is indicated by numbers at the left hand side of each panel in units of AB magnitudes per square arcsecond. The density is also dependent on the metallicity of the population being considered (which is indicated by the curve colors), and slightly on the age of the population, though we only show a τ = 11 Gyr old population here. There are two interesting summary points: (i) the number of detected stars per unit sky area is roughly constant once the Horizontal Branch is no longer detectable, since n tot ∝ d^2 and f det ∝ d^-2, and (ii) the redder bands, such as F 184 here, are extremely helpful in detecting parts of the RGB out to very great distances.
Our formalism does not take into account contamination by Milky Way foreground stars and background galaxies. As discussed in Appendix A of Jang et al. (2020), this can result in a number density of spurious sources of ∼ 3 × 10^-4 arcsec^-2 for color-magnitude cuts typical of metal-poor RGB stars; somewhat higher background densities, up to ×10^-3 arcsec^-2, are possible if somewhat wider color and magnitude ranges are considered. The observations of Jang et al. (2020) had t exp = 10^3 to 10^4 s and were made with the Hubble Space Telescope (HST). As we will show below, this contaminant density can dominate over the expected number density of detected stars for surface brightness values of interest. It will also affect the crowding and photometric reduction. Therefore, the effect of background sources is an important factor that needs to be taken into account when applying this formalism to real observations. We leave this for future work.
There are other effects that we do not account for here. For example, the effects of interstellar extinction, which should be small for the infrared wavelengths probed by Roman, could be important for the application of our formalism to other observations. Our formalism also does not address the presence of binary star systems in a population, or binary evolution. This should affect the predictions of this formalism, though likely only by a small factor. We also treat the completeness in detection as going from 100% at magnitudes brighter than B cut to 0% at magnitudes fainter than B cut . In reality this will be a smooth transition and could be an important effect in amplifying the number of stars we expect to detect. We leave addressing these issues to future work. Finally, we assume that the distance, d, to the stellar population being observed is constant over the observation. While this is usually a safe assumption for extra-galactic observations, it may not be when considering populations in the nearby universe. In principle, this could be addressed in the same way as age and metallicity, using some distribution in distance.
Figure 5. The evolution of the number density of detected stars with exposure time, t exp , for a theoretical observation of a population with a surface brightness Σ = 32 mag arcsec^-2, an age of τ = 11.2 Gyr, and a metallicity of [Fe/H] = −2, at a distance of d = 10 Mpc. Note that, realistically, the same population would not have the same surface brightness in each band. We plot each curve up until the crowding limit is reached for the individual bands, indicated by the circular points. We can see that, for such a low surface brightness, this crowding limit is only reached after a very long integration in most bands. Figures such as this can be very helpful when deciding how to effectively use exposure time. For example, we can see a drastic increase in n det for the F 062 band between 200 and 1000 seconds, with not nearly as much return on invested exposure time afterwards.

EXAMPLE CALCULATIONS: RESOLVING POPULATIONS WITH ROMAN
To make this formalism convenient to use, walter is accompanied by example notebooks that walk through the use of the code and how to plot certain quantities. In this section, we explain some of the examples given in the code. These calculations are intended to give the reader an intuitive understanding of the basic dependencies and the order of magnitude values; they are not meant to be exhaustive. For all presented examples, we work under the assumption that our observing instrument is the upcoming Roman Space Telescope. In the code, we point out where changes would need to be made if the code were to be applied more broadly.
To apply the formalism developed in Section 2 we must choose a mapping from initial stellar mass to luminosity in a given band and for given population parameters, or L B (m), which can be provided by any stellar evolution code. We use two different stellar evolution codes both to give an idea of the uncertainties associated with differences in stellar evolution calculations and because each code has its own benefits and drawbacks. The first code we use is the Modules for Experiments in Stellar Astrophysics (MESA) Isochrones and Stellar Tracks (MIST) code (Dotter 2016; Choi et al. 2016) and the second is the PARSEC set of isochrones (hereafter the Padova isochrones; Bressan et al. 2012; Marigo et al. 2017). Specifically, we download isochrones for stellar populations with metallicities [Fe/H] from 0.5 to −3.25 in increments of 0.25 dex and ages from log10 (τ /yr) = 8.95 to 10.1 in increments of 0.05 dex. Each isochrone spans a range of initial masses from roughly 0.1 M⊙ to 300 M⊙ and provides a numerical mapping between these initial masses and the magnitude in each photometric band of interest. The photometric bands of interest to us here are the proposed bands for Roman, namely F 062, F 087, F 106, F 129, F 146, F 158, and F 184. The Padova isochrones also provide predictions for the most recently added filter to the Roman mission, F 213. Some general information on these bands can be found in Table 1. One thing to keep in mind is that all of our formalism is in terms of AB magnitudes whereas the isochrones compute all quantities in Vega magnitudes. For this reason we provide the conversion between the two for each band in Table 1.
To perform all integrals related to Equations 1, 4, and 7, we integrate over the mass samples of the isochrones, linearly interpolating between these samples where cuts must be made. As mentioned earlier, we use a Kroupa IMF with masses limited to $0.08\,M_\odot \le m \le 120\,M_\odot$, with $\xi \propto m^{-1.3}$ for $0.08\,M_\odot \le m \le 0.5\,M_\odot$ and $\xi \propto m^{-2.3}$ for $0.5\,M_\odot \le m \le 120\,M_\odot$. The code to do all of these integrals is provided with the GitHub repository. We also provide an implementation of these integrals in Cython, which considerably speeds up their calculation.
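As an illustration of the kind of integral described above, the following is a minimal sketch of a two-segment Kroupa IMF and an IMF-weighted average evaluated with the trapezoid rule on a set of mass samples. It is not the walter implementation: the function names (`kroupa_imf`, `trapezoid`, `imf_average`) and the logarithmic mass grid are assumptions made for this example, and the normalization is left arbitrary because only ratios of integrals are used.

```python
import numpy as np

M_MIN, M_BREAK, M_MAX = 0.08, 0.5, 120.0  # solar masses, as quoted in the text


def kroupa_imf(m):
    """Unnormalized Kroupa IMF, continuous at the 0.5 Msun break."""
    m = np.asarray(m, dtype=float)
    low = m ** -1.3
    high = M_BREAK * m ** -2.3  # prefactor matches the two segments at the break
    xi = np.where(m < M_BREAK, low, high)
    # The IMF is zero outside the adopted mass limits.
    return np.where((m >= M_MIN) & (m <= M_MAX), xi, 0.0)


def trapezoid(y, x):
    """Trapezoid-rule integral of samples y over coordinates x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def imf_average(quantity, masses):
    """IMF-weighted mean of `quantity` sampled at `masses`."""
    weights = kroupa_imf(masses)
    return trapezoid(weights * quantity, masses) / trapezoid(weights, masses)


# Stand-in for isochrone mass samples (an assumption for this sketch).
masses = np.geomspace(M_MIN, M_MAX, 20001)
mean_mass = imf_average(masses, masses)  # average stellar mass of the IMF
```

Replacing `masses` in the first argument of `imf_average` with a band luminosity sampled on the same grid would give the IMF-averaged luminosity per star in that band.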

Calculating $f_{\rm det}$
We begin by illustrating the fraction of all stars in a specified stellar population that are detectable with a given observation, $f_{\rm det}$, and its dependence on the parameters of the observation. As stellar populations generally dim as they age, we expect $f_{\rm det}$ to decrease with age. The dependence on $t_{\rm exp}$ and $d$ is given through the limiting magnitude as stated in Equation 3; the dependence on metallicity, however, is less clear. To make $f_{\rm det}$ more transparent, we illustrate in Figure 2 its evolution with distance for several values of metallicity and age of the observed stellar population, for exposure times of $t_{\rm exp} = 10^3$ and $10^4$ seconds, using the MIST isochrones. We additionally indicate several evolutionary stages in Figure 2 by associating each stage with the distance at which it is just barely detectable. We do this by assigning an absolute magnitude to each stage, taken as the brightest magnitude within a range of Equivalent Evolutionary Phase (EEP) provided by the MIST code (Dotter 2016; Choi et al. 2016). The EEP ranges considered are EEP ≤ 495 for the Main Sequence Turn-Off (MSTO), 630 < EEP < 640 for the Horizontal Branch (HB) or Red Clump (RC), and 560 < EEP < 580 for the Tip of the Red Giant Branch (TRGB). These EEP ranges are not the default delineations between phases of evolution given by the MIST team, but we found that these choices more closely match the corresponding drops in $f_{\rm det}$ in Figure 2.
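The definition of the detected fraction can be sketched numerically as the IMF-weighted fraction of stars whose apparent magnitude is brighter than the limiting magnitude. The sketch below is not the walter code. The mass-to-magnitude relation is a HYPOTHETICAL toy relation standing in for a real isochrone, the limiting magnitude is an arbitrary illustrative value, and the detection cut is applied as a hard mask on the samples rather than by interpolating at the cut.

```python
import numpy as np


def kroupa_imf(m):
    """Unnormalized Kroupa IMF over 0.08-120 Msun, continuous at 0.5 Msun."""
    m = np.asarray(m, dtype=float)
    xi = np.where(m < 0.5, m ** -1.3, 0.5 * m ** -2.3)
    return np.where((m >= 0.08) & (m <= 120.0), xi, 0.0)


def trapezoid(y, x):
    """Trapezoid-rule integral of samples y over coordinates x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def distance_modulus(d_mpc):
    """Distance modulus for a distance in Mpc (10 pc = 1e-5 Mpc)."""
    return 5.0 * np.log10(d_mpc * 1e5)


def f_det(masses, abs_mags, d_mpc, m_lim):
    """Number fraction of stars with apparent magnitude brighter than m_lim."""
    weights = kroupa_imf(masses)
    detected = (abs_mags + distance_modulus(d_mpc)) <= m_lim
    # Undetected samples contribute zero weight to the numerator.
    return trapezoid(weights * detected, masses) / trapezoid(weights, masses)


# HYPOTHETICAL toy isochrone: main-sequence-like relation, not MIST or PARSEC.
masses = np.geomspace(0.08, 1.0, 5000)
abs_mags = 4.8 - 10.0 * np.log10(masses)

f_near = f_det(masses, abs_mags, d_mpc=0.01, m_lim=27.0)
f_far = f_det(masses, abs_mags, d_mpc=0.1, m_lim=27.0)
```

With a real isochrone, `masses` and `abs_mags` would be the tabulated initial masses and band magnitudes, and the sharp drops in the detected fraction would occur where bright evolutionary phases cross the limiting magnitude.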
In Figure 2, we additionally see the expected trend with population age (decreasing $f_{\rm det}$ with age) and that metallicity most noticeably affects the tail end of the $f_{\rm det}$ evolution with distance, indicative of its effect on the luminosity of the RGB. Figure 2 shows that a $10^3$ second exposure with Roman will be able to resolve (detect at 5σ significance as an isolated source) horizontal branch stars out to a distance of 5 Mpc in the F158 band, and TRGB stars out to about 20 Mpc. Figure 2 also illustrates why 10 Mpc is usually thought of as a reference distance for resolved star studies with Roman: beyond roughly this distance one can no longer easily observe most of the RGB over a large fraction of the parameter space (Spergel et al. 2015). It is important to note that 10 Mpc is not an absolute limit; Roman will be able to see resolved populations at much greater distances with longer exposures (subject to crowding limits, discussed further below). Nevertheless, 10 Mpc remains a useful reference distance, especially for studies aiming to observe population features like the Horizontal Branch.
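The distance at which a feature is just barely detectable follows from the standard distance modulus relation. The snippet below is a minimal sketch of that conversion; the function name and the example magnitudes are assumptions chosen for illustration and are not values taken from the isochrones or from Figure 2.

```python
import math


def max_distance_mpc(abs_mag, m_lim):
    """Largest distance (Mpc) at which a star of absolute magnitude
    abs_mag is still brighter than the limiting magnitude m_lim."""
    distance_pc = 10.0 ** ((m_lim - abs_mag + 5.0) / 5.0)
    return distance_pc / 1.0e6


# HYPOTHETICAL numbers for illustration only.
d_feature = max_distance_mpc(abs_mag=-5.0, m_lim=25.0)

# A feature that is dm magnitudes brighter can be seen 10**(dm / 5) times
# farther; a factor of 4 in distance corresponds to about 3 magnitudes.
dm_for_factor_four = 5.0 * math.log10(4.0)
```

The factor of four between the horizontal branch and TRGB distances quoted above therefore corresponds to a difference of roughly 3 magnitudes between the two features at fixed limiting magnitude.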

Calculating $n_{\rm tot}$
Now that we have provided a detailed look at the evolution of $f_{\rm det}$ with various parameters of the population, we would like to provide some reference values for the parts of the observation that are independent of the particulars of what is detected. More precisely, for a given surface brightness (which is band dependent) and population, we ask what total density of stars would be expected, both detected and undetected. The real quantity of interest will be a combination of this quantity with $f_{\rm det}$, but it is useful to have numbers in mind for the total density of stars. This is given by Equation 2, and its distance evolution depends only on the smooth, analytic evolution of the distance modulus, so that $n_{\rm tot} \propto d^2$ at constant $\Sigma_B$. The exact normalization, however, will depend on the band and population parameters.
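The $d^2$ scaling can be made concrete with a short sketch. It assumes that the surface brightness is produced by $n_{\rm tot}$ stars per square arcsecond, each contributing the population-averaged luminosity; this is a reading of the argument above rather than a transcription of Equation 2. The function name and the parameter `mean_abs_mag` (the absolute magnitude corresponding to the IMF-averaged luminosity per star) are assumptions made for the example.

```python
import numpy as np


def n_tot(sigma_b, mean_abs_mag, d_mpc):
    """Total stars per arcsec^2 needed to produce surface brightness sigma_b.

    sigma_b      : surface brightness in mag arcsec^-2
    mean_abs_mag : absolute magnitude of the IMF-averaged luminosity per star
    d_mpc        : distance in Mpc
    """
    mu = 5.0 * np.log10(d_mpc * 1e5)  # distance modulus
    # sigma_b = mean_abs_mag + mu - 2.5 log10(n_tot), solved for n_tot.
    return 10.0 ** (0.4 * (mean_abs_mag + mu - sigma_b))


# HYPOTHETICAL inputs for illustration only.
ratio_distance = n_tot(30.0, 6.0, 8.0) / n_tot(30.0, 6.0, 4.0)
ratio_brightness = n_tot(35.0, 6.0, 4.0) / n_tot(30.0, 6.0, 4.0)
```

Doubling the distance at fixed surface brightness quadruples the required stellar density, and a surface brightness fainter by 5 magnitudes requires 100 times fewer stars.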
To give reference values for these normalizations, Figure 3 shows $n_{\rm tot}$ for several values of metallicity (indicated by color), bands (F062 and F184, as indicated in the bottom right corner of each panel), surface brightness in those bands (indicated by the text adjacent to the curves in each panel), and population ages (indicated by line style) for the Padova isochrones. We see that, as one would expect, the surface brightness of the observed population most strongly determines $n_{\rm tot}$. We can also see the relative differences between the bluest (F062) and second reddest (F184) Roman bands: because the populations under consideration are generally dimmer in the bluer bands than in the red bands (the average brightness is dominated by the reddest stars), it takes more stars per unit area in the F062 band to reach the same surface brightness. This trend is also reflected in the ages, where the younger populations (dot-dashed lines) have fewer stars at fixed surface brightness, since their stars are (on average) brighter.

Calculating $n_{\rm det}$
Finally, the quantity of greatest interest to the observer is the density of stars that are detected, described in full by Equation 5. This quantity is essentially the product of the quantities shown in Figures 2 and 3. In Figure 4 we present $n_{\rm det}$ in a format similar to that of Figure 3, with the calculations performed using the Padova isochrones.
We see that the $n_{\rm tot} \propto d^2$ growth nearly compensates for the drop in $f_{\rm det}$ over a large range in distance for a given surface brightness, so that $n_{\rm det}$ is roughly constant with distance over that range. We also see large differences between the F062 and F184 bands: the bluer band (F062) detects a much larger number of stars at fixed surface brightness, but its detections at large distances are much reduced compared to the redder band (F184).
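One way to quantify this compensation is through the logarithmic slope of $n_{\rm det}$ with distance, which equals 2 plus the logarithmic slope of $f_{\rm det}$. The sketch below computes that slope numerically. The power-law form of $f_{\rm det}$ and both normalizations are HYPOTHETICAL placeholders, not outputs of walter, and the helper names are assumptions made for this example.

```python
import numpy as np


def n_det(f_det, n_tot):
    """Detected stellar density as the product of the two factors."""
    return np.asarray(f_det) * np.asarray(n_tot)


def log_slope(d, y):
    """Local logarithmic slope d ln(y) / d ln(d)."""
    return np.gradient(np.log(y), np.log(d))


d = np.geomspace(1.0, 10.0, 50)  # distance in Mpc
n_tot_curve = 100.0 * d ** 2     # HYPOTHETICAL normalization, n_tot ~ d^2
f_det_curve = 1e-3 * d ** -1.8   # HYPOTHETICAL decline of f_det with distance

slope = log_slope(d, n_det(f_det_curve, n_tot_curve))
```

In this toy case the slope is 0.2 everywhere, so the detected density changes by less than a factor of two over a decade in distance; a slope near zero in a real calculation marks the range over which $n_{\rm det}$ is roughly constant.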
Additionally, we see that, in the redder bands, lower metallicity populations generally have larger $n_{\rm tot}$ at fixed surface brightness (as shown in Figure 3). However, they also generally have a much larger proportion of these stars below the detection threshold at most exposure times and distances. As a result, the lower metallicity populations have slightly larger $n_{\rm det}$ at nearby distances but fall off much earlier than the high metallicity populations. This trend is reversed in the bluer band, F062.
We have thus far shown everything as a function of distance, but the observer will be mainly concerned with picking the correct exposure time for a given observation. In Figure 5, we give an example of this sort of calculation. We show what $n_{\rm det}$ would look like for a proposed observation of a population with $\Sigma = 32$ mag arcsec$^{-2}$ (in each band), $\tau = 11.2$ Gyr, and [Fe/H] = −2 at a distance of $d = 10$ Mpc. In Figure 5, the point at the end of each curve indicates the exposure time at which the observation becomes crowding limited. These sorts of plots will be extremely useful in making decisions about how to invest exposure time. For example, we see that the most significant return for the observation proposed in Figure 5 is reached around $t_{\rm exp} = 500$ seconds in virtually all bands, after which the slope of each curve becomes significantly less positive. It is also striking that any stars at all are detected in a 10 second exposure, even if it is only ∼ 1 for every million square arcseconds. What we are seeing here is the tail of the asymptotic giant branch (AGB), combined with the fact that these stars are much brighter in the red Roman bands than one would typically expect in comparison to, for example, the optical bands of HST.

TEST AGAINST OBSERVATIONS
In this section we test our formalism against observations of resolved stellar populations in external galaxies from the ACS Nearby Galaxy Survey Treasury (ANGST; Dalcanton et al. 2009; Weisz et al. 2011). To provide the simplest check of the code in its current state, we wish to compare against an object of nearly constant surface brightness with as close to a single-metallicity, single-age stellar population as possible. We therefore choose the galaxy FM1 (a.k.a. F6D1), which has a nearly uniformly old (∼ 11 Gyr) and metal-poor ([Fe/H] ∼ −1.25) stellar population as inferred from its resolved CMD (Weisz et al. 2011). Dalcanton et al. (2009) report 19,390 stars jointly detected in the F606W and F814W filters of the Hubble Space Telescope's Advanced Camera for Surveys (ACS), with 50% completeness limiting magnitudes of 28.92 and 27.85, respectively (in the Vega system). Detection in both bands is necessary for removal of contamination from background galaxies (e.g., Muzzin et al. 2013).
To predict the number of stars that should be detected in these observations, we take the global surface brightness and size of FM1 from the Karachentsev et al. (2013) catalog. Specifically, FM1 is reported to have an average surface brightness in the B band$^{8}$ of 25.8 mag arcsec$^{-2}$ and an angular diameter of 0.89′. We also use the distance to FM1 reported by Dalcanton et al. (2009), $d = 3.4$ Mpc. To calculate the number of stars expected to be detected in the F814W band, we compute the population-averaged magnitude correction between the F435W (approximately B band) and F814W bands for a stellar population with age $\log_{10}(\tau/{\rm yr}) = 10.05$ and metallicity [Fe/H] = −1.25. We find $M_{\rm F435W} - M_{\rm F814W} = 1.7$, so that our inferred average surface brightness for FM1 in the F814W band is 24.1 mag arcsec$^{-2}$.
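The band conversion above is simple arithmetic, sketched below together with the area of a circular footprint. The variable names are assumptions made for this example, and the footprint calculation ASSUMES that the catalog angular diameter is expressed in arcminutes and that the footprint is circular.

```python
import math

sigma_b = 25.8         # B-band surface brightness, mag arcsec^-2 (from the text)
color_correction = 1.7  # M_F435W - M_F814W for the adopted population (from the text)

# A population that is brighter in F814W by the color term has a
# correspondingly brighter (numerically smaller) surface brightness there.
sigma_f814w = sigma_b - color_correction

# ASSUMPTION: angular diameter of 0.89 arcmin, circular footprint.
diameter_arcsec = 0.89 * 60.0
footprint_area_arcsec2 = math.pi * (diameter_arcsec / 2.0) ** 2
```

A predicted star count for the footprint is then the detected density in stars per square arcsecond multiplied by `footprint_area_arcsec2`.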
Using the 50% completeness limiting magnitude of 27.85 in the F814W band as our $B_{5\sigma}$, and the same age and metallicity used for the correction above, we find $f_{\rm det} = 4.2 \times 10^{-4}$. With the apparent surface brightness of $\Sigma_{\rm F814W} = 24.1$ mag arcsec$^{-2}$ and $d = 3.4$ Mpc for the same stellar population, we find the total number density of stars (detected and undetected) to be $n_{\rm tot} = 5884$ arcsec$^{-2}$. Taking the angular diameter reported by Karachentsev et al. (2013), this implies a total number of detected stars of 10,795. If we restrict the detected stars reported in Dalcanton et al. (2009) to this same assumed footprint, we find $N_{\rm det} = 5162$. These numbers differ by about a factor of two, but in the calculation so far we have ignored the effects of extinction and assumed 100% completeness at the 50% completeness limiting magnitude, both of which bias us toward inferring a larger number of stars than are actually detected. If we adopt a limiting magnitude that is 0.5 magnitudes brighter (reasonable considering the 0.12 mag estimated extinction in the F814W band, Schlafly & Finkbeiner 2011, and the lack of full completeness), we find a predicted number of detected stars of 5534. Given the assumptions of the calculation and the fact that the Red Clump lies near the edge of the detection limit for FM1 (Dalcanton et al. 2009), we consider the agreement between the observed 5162 stars and the predicted 5534 stars a successful test of this framework.

DISCUSSION OF APPLICATIONS
Now that we have produced the tools necessary to determine the sensitivity of observations to a wide range of resolved stellar populations at any distance and surface brightness, we can apply the tools to optimize observing efficiency for observations of nearby galaxies with Roman. Below we provide a few examples of how one might perform such optimizations. In Section 5.1, we start with optimizing filters for the number of stars detected in a given amount of observing time. In Section 5.2 we then discuss optimizing observations that wish to detect a given population feature. Lastly, in Section 5.3 we discuss the example of large halos where we would only cover a fraction of the structures in a single pointing.

Filter Choices -One Example
Generally speaking, the choice of filters will depend strongly on the specific science case under consideration or comparability with past measurements (and therefore similarity between filters that have been used in the past). With that in mind, we provide here an example of how one might go about calculating the best filters to use in the case that one is trying to create a color-magnitude diagram (CMD) of an observed population, with no particular interest in any specific part of the CMD. The main considerations in this case would be (i) maximizing the number of stars (probably applicable to many other science goals) and (ii) maximizing the difference in color between the filters used. In this case one will generally have a choice of one redder and one bluer band.
We can then apply our software to determine the optimal exposure time ratios between filters for a given population, and which filters will be best to use within the time constraints. In Figure 5, we can see that the F062 and F158 filters reach the highest $n_{\rm det}$ in the range $t_{\rm exp} = 10^3$ to $10^4$ seconds$^{9}$, which additionally allows for a large color spread. This large color spread is especially important for the creation of CMDs from observations, because it gives better discriminating power on stellar temperatures. Moreover, we can see that at exposure times of roughly $10^3$ seconds the science return, measured as the number of stars per unit exposure time, diminishes, as indicated by the flattening of each curve. Thus, for the observation parameters given in Figure 5, the most efficient observing plan would be one that exposes F062 and F158 for ∼ $10^3$ seconds. While the best filters to use should generally be a function of the population being considered, we would expect the F062 and F158 filters to generally be good choices for observations aiming to create a CMD, given considerations of depth and color difference. The F087 filter could also be a good replacement for the F062 filter at shorter exposure times, since it is also a bluer band and reaches larger $n_{\rm det}$ at slightly shorter $t_{\rm exp}$.
$^{9}$ We exclude the F146 band due to its wide wavelength range, which prevents it from providing useful color information.

Detecting a Given Population Feature
Another possibility is that the observing program requires that some feature of the stellar population, like the horizontal branch/red clump or TRGB, be detected in at least 4 bands (e.g., to allow for optimal background galaxy separation). This kind of optimization can be guided by plots similar to those shown in Figure 2, which shows the number fraction of the population that is detected as a function of distance at fixed exposure time. In a similar vein, we can also isolate individual population features and calculate the exposure time needed to detect them at a given distance. We give an example of this in the time to feature Jupyter notebook provided in the code accompanying this paper. As an example, for a population with [Fe/H] = −2 and $\tau = 10$ Gyr at 10 Mpc, we can reach one magnitude below the TRGB in 4 bands the fastest if we choose F106, F129, F158, and F184, with exposure times of 1777, 1001, 451, and 866 seconds, respectively. Given the lack of color information provided by the F146 band, we have excluded it from consideration here.
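The exposure time needed to reach a feature can be estimated by scaling from a known reference depth. The sketch below ASSUMES a background-limited observation, in which signal-to-noise grows as the square root of exposure time, so that the limiting magnitude deepens by $1.25 \log_{10}$ of the exposure time ratio. The limiting-magnitude relation actually used by walter (Equation 3) may include additional terms. The function names and all numerical inputs are hypothetical and are not Roman specifications.

```python
import math


def distance_modulus(d_mpc):
    """Distance modulus for a distance in Mpc."""
    return 5.0 * math.log10(d_mpc * 1e5)


def exposure_time_to_reach(abs_mag, d_mpc, m_lim_ref, t_ref_s, margin_mag=0.0):
    """Exposure time (s) needed to reach `margin_mag` below a feature.

    ASSUMES background-limited scaling: depth gains 1.25 * log10(t / t_ref),
    so t = t_ref * 10**(0.8 * (m_target - m_lim_ref)).
    """
    m_target = abs_mag + distance_modulus(d_mpc) + margin_mag
    return t_ref_s * 10.0 ** (0.8 * (m_target - m_lim_ref))


# HYPOTHETICAL inputs: a feature with absolute magnitude -5 at 10 Mpc,
# observed one magnitude below the feature, with a reference depth of
# 26 mag (or 25 mag) reached in 1000 s.
t_same_depth = exposure_time_to_reach(-5.0, 10.0, 26.0, 1000.0, margin_mag=1.0)
t_one_mag_deeper = exposure_time_to_reach(-5.0, 10.0, 25.0, 1000.0, margin_mag=1.0)
```

Under this scaling each additional magnitude of depth costs a factor of about 6.3 in exposure time, which is why the choice of bands in which a feature is intrinsically brightest has a large effect on the total time required.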

Application to Galaxy Halos
As a final example, we explore the possibility of mapping a portion of the halo of a nearby galaxy to search for streams. We start with a model halo from Bullock & Johnston (2005). These simulations consist of stellar tracer particles created by tagging dark matter particles in the simulation and following them as they accrete onto the model halo. We create surface brightness maps from these simulations by projecting the particles along a given axis and binning them on a grid in the other two (non-projected) dimensions. We then calculate the total luminosity in a given band, $B$, from each of these particles by multiplying their masses by $L_B/m$, where $m$ is the average stellar mass of the IMF$^{10}$. We then sum the luminosities in each grid cell (pixel), convert this sum to a magnitude, and finally convert to a surface brightness by adding the distance modulus and a factor accounting for the sky area of each pixel. Since surface brightness is independent of distance, a distance is not needed to obtain the surface brightness map. However, one does need a distance to assign an angular scale to the pixels.
Figure 6. Left: Surface brightness map of a Bullock & Johnston (2005) halo in the F062 filter, created as described in the text. The surface brightness is indicated by the color bar. This mock observation assumes that the galaxy is at a distance of 4 Mpc, which sets the scale of the axes in degrees and the size of the RST footprints (shown as light blue and light green tilings, chosen to lie at disjoint locations at different representative galactic radii). We additionally show a red bar that is 50 kpc in length at the upper left, for reference. Right: The number of detected stars, $N_{\rm det}$ (solid lines), and the fraction of the observation that is crowded, $f_{\rm crowd}$ (dashed lines), for the two different locations of observations indicated in the left-hand panel. The dotted green line represents the number of detected stars in a giant stream that overlaps the Location 2 field of view.
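The map construction described in the text can be sketched as follows: bin the projected particles on a grid, sum the band luminosity in each pixel, and convert to magnitudes per square arcsecond. This is a minimal sketch rather than the code shipped with walter. The function name, its parameters (`lum_per_mass`, `abs_mag_sun`, the grid size), and the randomly generated particles are all assumptions made for the example.

```python
import numpy as np

ARCSEC_PER_RAD = 180.0 / np.pi * 3600.0


def surface_brightness_map(x_kpc, y_kpc, mass, lum_per_mass, abs_mag_sun,
                           d_mpc, half_width_kpc=100.0, n_pix=64):
    """Surface brightness (mag arcsec^-2) per pixel for projected particles.

    lum_per_mass : band luminosity per unit stellar mass (Lsun / Msun)
    abs_mag_sun  : absolute magnitude of the Sun in the same band
    """
    edges = np.linspace(-half_width_kpc, half_width_kpc, n_pix + 1)
    # Sum the luminosity of all particles falling in each pixel.
    lum, _, _ = np.histogram2d(x_kpc, y_kpc, bins=[edges, edges],
                               weights=mass * lum_per_mass)
    pix_kpc = edges[1] - edges[0]
    pix_arcsec = pix_kpc / (d_mpc * 1e3) * ARCSEC_PER_RAD
    mu = 5.0 * np.log10(d_mpc * 1e5)  # distance modulus
    with np.errstate(divide="ignore"):
        abs_mag = abs_mag_sun - 2.5 * np.log10(lum)  # empty pixels become +inf
    # Apparent magnitude of the pixel, spread over its area on the sky.
    return abs_mag + mu + 2.5 * np.log10(pix_arcsec ** 2)


# HYPOTHETICAL particles and band constants, for illustration only.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 30.0, 2000)
y = rng.normal(0.0, 30.0, 2000)
mass = np.full(2000, 1.0e4)
sb_4mpc = surface_brightness_map(x, y, mass, 0.5, 4.5, d_mpc=4.0)
sb_8mpc = surface_brightness_map(x, y, mass, 0.5, 4.5, d_mpc=8.0)
```

The two maps agree in every populated pixel, because the distance modulus and the pixel solid angle cancel. The distance matters only for the angular size assigned to each pixel.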
In the left panel of Figure 6 we show one of our surface brightness maps in the F062 band. Several of the streams have surface brightness ∼ 32 mag arcsec$^{-2}$, superimposed on a halo background of surface brightness ∼ 37 mag arcsec$^{-2}$. This map does not include the surface brightness of background galaxies. As we noted in Section 2.3, our formalism generally ignores the crowding effects of foreground MW stars and background galaxies. The number density of the background galaxies could be as high as or higher than the number density of stars detected from a stellar stream, as we will see below.
Subsequently, we perform mock observations of this halo (as carried out in the code provided with this work) by placing a synthetic Roman field of view on the image and predicting the number of stars that would be observed in each RST detector for a given exposure time and two different placements of the detector. These detector placements were chosen to be physically disjoint from one another, in representative areas of the observed halo at different galactic radii. The Location 2 placement was also chosen to lie on top of a giant stream in the simulated halo. For this calculation we have assumed that the halo is at a distance of 4 Mpc. This calculation assumes a single stellar population, which is not true of the actual simulated halo, though this could be easily extended within walter. Furthermore, these stellar surface densities allow us to measure the expected amount of crowding, assuming that the crowding limit occurs according to the formalism laid out in Section 2.2. We show both the number density of expected stars and the fraction of the observation that is crowded in the right panel of Figure 6.
$^{10}$ This assumes that each stellar tracer particle is massive enough to fully sample the IMF, which is not truly the case in general.
Finally, we can break these numbers down into the number of stars that will come from a feature of interest versus the surrounding ambient halo. When interpreting these numbers, it is important to keep in mind that we do not take into account background sources (as noted in Section 2.3), which would seriously affect the contrast here. For example, there is a stream with surface brightness ∼ 34 mag arcsec$^{-2}$ that crosses Location 2, where the ambient background has a surface brightness of ∼ 35 mag arcsec$^{-2}$. In a detector that contains the stream, a total of 39,400 stars will be detected in a 1 hour exposure in F062, of which 331 will belong to the stream. This may not seem like many stars relative to all of the stars collected in the observation, but the stream and stellar halo will be distinct in their spatial and color distributions in ways that are not illustrated by this simple comparison. We anticipate this code being used in future work to provide more quantitative constraints on the ability to detect these tidal features.

CONCLUSION
We have developed an open source public software package designed to optimize observing programs aimed at studying resolved stellar structures and resolved stellar populations. The software package can quickly calculate the number of stars one will detect in an observation if given the population age and metallicity along with the surface brightness, filter and exposure time. While this number will be useful for many planning purposes, observers must also keep in mind that in addition to the detected stars, which our software can predict, there will be background galaxies that will need to be filtered and taken into account for detection of features. The code is available on GitHub, has minimal dependencies, and is laid out with specific applications to the Roman Space Telescope.
We have shown how this software can be used to optimize observing efficiency for a few example programs that one might consider, including determining the best filter choices for a particular science case, detecting dwarf galaxies, and searching for stellar streams to a specific surface brightness limit. There will likely be many other cases for which this package will be useful, and we hope that it encourages the community to get involved in planning potential General Astrophysics Observations with the Nancy Grace Roman Space Telescope.