Theory of multiple-stellar population synthesis in a non-Hamiltonian setting

We aim to investigate the connections existing between the density profiles of the stellar populations used to define a gravitationally bound stellar system and their star formation history: we do this by developing a general framework accounting for both classical stellar population theory and classical stellar dynamics. We extend the work of Pasetto et al. (2012) on a single composite-stellar population (CSP) to multiple CSPs, including also a phase-space description of the CSP concept. In this framework, we use the concept of distribution function to define the CSP in terms of mass, metallicity, and phase-space in a suitable space of existence $\mathbb{E}$ of the CSP. We introduce the concept of foliation of $\mathbb{E}$ to describe formally any CSP as sum of disjointed Simple Stellar Populations (SSP), with the aim to offer a more general formal setting to cast the equations of stellar populations theory and stellar dynamics theory. In doing so, we allow the CSP to be object of dissipation processes thus developing its dynamics in a general non-Hamiltonian framework. Furthermore, we investigate the necessary and sufficient condition to realize a multiple CSP consistent with its mass-metallicity and phase-space distribution function over its temporal evolution, for a collisionless CSP. Finally, analytical and numerical examples show the potential of the result obtained.


Introduction
Stars are the fundamental constituents of a galaxy. Our understanding of galactic structure and evolution depends very much on the processes governing their birth, and evolution. The evolutionary time scales of stars, their energy feedback, yields of chemically enriched material into the interstellar medium, end products of their evolutionary history, and distribution in space and time characterize the structure of the galaxies and govern their evolution. However, all these stellar phases and products are often subject to uncertainties of both theoretical and observational nature, generating a lacking comprehension of these important issues. The effort to address these difficulties must be carried on in a dual way: with the collection of new data and with the development of new theoretical frameworks to interpret these data.
In the era of wide-field surveys, dealing with exponentially growing numbers of stars has become a challenge both for observational analyses and for their theoretical interpretation. In this contribution, we will address the latter. The difficulties of dealing with a large number of stars have influenced historically both the classical stel-lar dynamics and the classic stellar population theories. In classical stellar dynamics, from the few-body problem the attention moved to the mathematical formulation of a many-body problem starting from the pioneering works of Eddington, Chandrasekhar, and others who applied the concepts of statistical mechanics (e.g., the Liouville and Boltzmann equation) and the theory of the potential to "groups of stars" subject to a shared gravitational potential and hence described by a distribution function (e.g., Heggie and Hut, 2003;Saslaw, 1985). In the second half of the past century, a similar concept of "stellar populations" was used initially to address the fundamental equation of stellar statistics, the star-count equation (e.g., Seeliger, 1898;Trumpler and Weaver, 1953). This concept reached the astronomy research field thanks to the observer W. Baade and finally proliferated in the Galaxy modeling field in the 80s (see, e.g., Bahcall andSoneira, 1980, 1984;Bahcall, 1984;Ratnatunga and Bahcall, 1985). In these works, the idea of stellar population involves the photometry alone without phase-space treatment (e.g., Gunn et al., 1981;Tinsley, 1972Tinsley, , 1973. The first works attempting a global model generalization can be dated back to Bienayme et al. (1987), Casertano et al. (1990) and Mendez and van Altena (1996).
We want to merge these two concepts of stellar populations coming from classical stellar dynamics theory and classical stellar population theory, with the goal to precisely define the minimum condition under which these theories give consistent results. On the one hand, classical stellar dynamics defines a composite stellar population by its density profiles: its natural environment is the phase-space where position and momentum determine the distribution of the stars in the phase-space. On the other hand, classical stellar population theory defines a composite stellar population through its star formation history and initial mass function: its natural environment is the mass and metallicity space within which the stars move according to the fuel consumption theorem. To formulate a comprehensive framework able to account for both the theories is a difficult mathematical task. Here we limit ourselves to the investigation of a simpler, but no less important, task that tightly connects to the star-count modeling techniques. We cast the problem in the following way: if both classical stellar dynamics and classical stellar population theories determine the total mass of a composite stellar population, which is the condition for these approaches to coincide? While for one composite stellar population the answer is known, this is not true for two or more stellar populations. In this work we will derive it for the first time (see also Pasetto et al., 2018a).
The most remarkable advancement in the mathematical treatment of groups of stars (i.e., populations) probably happened at the beginning of the past century with the introduction of continuous functions: although stars are discrete elements, large gravitationally bound groups of stars sharing common properties started to be studied using continuous distribution functions (DFs) and continuity relations, rather than set-theory (i.e., stars by stars summations). This represented a great advancement with respect to the Celestial mechanics punctual treatment based on the 3-body/few-body problem, etc. In this work we introduce novel mathematical instruments, as the foliations, to address classical stellar population problems. Pasetto et al. (2012) introduced a new theoretical framework for the concept of stellar populations, and we here endorse and extend it to include multiple composite stellar populations (CSPs). This formalism has the advantage to include in the description of the classical dynamics of a CSP (based on the concept of distribution functions as well) the concepts that are natural to the theory of stellar populations (e.g., initial mass function, star formation rate, etc.). In the treatment that we are proposing, the star birth and death is formally included (hence changing the total number of stars) without any limitation on the nature of their dynamics. The formalism is correct both in the case of a collisional CSP of globular clusters, and a collisionless CSP of a galaxy. Furthermore, this formalism does not depend on the Hamiltonian nature of the dynamics (see Sec. 4).
The application of this general concept to the Milky Way (MW) has been presented in Pasetto et al. (2016) and will be reviewed briefly in the next section. We start recalling some basic concepts and definitions from Pasetto et al. (2012) and Pasetto et al. (2016) in Sec. 2.1. In Sec.2.2 we set the basis for the idea of multiple stellar populations. In Sec.2.3 we have a closer look at the necessary and sufficient condition for a system of the composite stellar population to be coherent in mass. In Sec.3 we present two numerical examples which highlight the potential of the theory, in Sec.4 we discuss our results and in Sec. 5 we draw our conclusions. The mathematical aspects of our work are detailed in Appendix A.

Basic concepts of a non-Hamiltonian statistical mechanics for CSPs
A composite stellar population, or simply CSP, is a set of stars born at a different time t, positions x, with different velocities v, masses M, and chemical compositions Z. We assume that every star lives in the space being the number of stars, and R + 0 the set of positive real numbers including zero)( 1 ). At each time t, a single realization of a CSP can be defined as a the s th set of points E s ∈ E defined by some arbitrary properties (i.e., the variable of state of the CSP). Following classical statistical mechanics arguments, we consider not such a single realization of a CSP (microstate), but an infinite collection of the CSPs characterized by the same macroscopic state average (e.g., energy, density, velocity dispersion, metallicity, etc.) but different microscopic conditions, i.e., different microstates s. If a point E s is representative of the s th -microstate we consider the set of all the {s, q} : E s E q at any arbitrary t. Because the ensemble contains an infinite number of 1 The choice of the domain of existence is arbitrary and made to exploit the following formalism. Other powerful solutions as M ⊂ (R + 0 ) N for the space of masses, [F e/H] ⊂ R N for the space of metallicity, and Γ ⊂ R 6 N for the phase-space, can lead to a formalism in E ⊆ (R + 0 ) N × R N × R 6 N that is potentially interesting but more distant from classical stellar population theory.
states, the change of the state variables of each CSP happens smoothly, i.e., continuously passing between neighboring states. This allows us to describe the CSPs by a distribution function f c : E → I ⊂ R + 0 with I finite interval of the real positive line including zero. Under this hypothesis, the evolution of f c is given by the Liouville equation for non-Hamiltonian systems (e.g., Colin, 1998) that we write as: with g(x; t) being the metric tensor of E introduced above, which is the classic Liouville equation generalized to (non-Euclidean) dissipative spaces, as we assumed E to be.
Hereafter ∇ x refers to the gradient over a set of basis coordinates x, •, • to the inner product, and ∂ t to the partial derivative with respect to the time( 2 ). Every time a system presents irreversibility, e.g., the system presents dissipative processes, gas-processes, friction, interaction, merges, etc. it is non-Hamiltonian and non-Hamiltonian statistics has to be used to describe its irreversible dynamics. We can express the Eq.(1) by introducing the evolution operator ιE [•]: with ι complex unit, f c ≡ √ g f and d dt [•] the total derivative operator( 3 ). As mentioned above, in general the CSPs are non-Hamiltonian entities, and their total number of stars is not conserved. The only hypothesis that we require for Eq.(2) is that the DF is sufficiently smooth so that the necessary derivatives exist; we will assume for simplicity that f c ∈ C ∞ (E) (i.e., the set of continuous functions with infinitely continuous derivatives). The formal solution of Eq.(2) is then where, mutating the name from quantum mechanics, we call e −ι Et [•] the evolution propagator. Here we can decoupled the operator E [•] linearly, in such a way that E [•] is split in a part granting the evolution and normalization of f c given by ιL [•] (standard Liouville operator), and in a part accounting for the compressibility of E in the case of external fields, say ιΛ [•], whose function is to account for the compression of the phase-space without changing the number of stars. Finally, a third part, say ιF [•], accounts for the rate of change of the number of stars in the stellar population, N = N (t).

Multiple stellar populations
We will focus our attention on gravitationally bound systems of collisionless/collisional stellar populations, composed of N = N (t) < ∞ stars as long as it is possible to identify unambiguously every star. We formally need only the enumerability of the stars, that for the purposes of the normalization of f c in E can be considered identical indistinguishable elements. ( 4 ). We will ask also for a slightly more restrictive hypothesis of phase-space mixed CSPs in Γ, with a detailed-balance in Z, and non-interacting stars (e.g., we exclude interacting binaries). These hypotheses are in agreement with our request of continuity for the temporal evolution of f c and with the Liouville description introduced in Eq.(2) thus granting the possibility to be always able to disentangle two different CSPs in their evolution of time in the spirit of a non-Markovian evolution( 5 ). In this way, we assume that it is always possible to know the position and velocity of each single star in E, so that the concepts of distribution function in the phasespace and average metallicity of the stars are always well defined. This argument implies that a trajectory gives the evolution of a system in the extended E ′ ≡ (E; t) space, and two different initial conditions lead to distinct nonintersecting paths in E ′ called CSP orbits in E, x = x(E; t) 6 .
The equilibrium hypothesis for f c in the whole E does not hold strictly, i.e., there is not a globally defined f ∞ that holds over the entire space E∀t and to which the CSPs tend with increasing time. However, on limited-volume subsets of E and limited time intervals, we will be still able to define "stationary states" under a suitable hypothesis for the two-body relaxation time in Γ or time-independent main-sequence phases in M × Z. Hence, as expected by stellar dynamics and stellar structure standard theories, we will consider galaxies not as ergodic systems, but we will let isolating integrals to exist, and to foliate the phasespace Γ thus allowing us to speaking about, e.g., "families of orbits" in Γ. In the same way, low-mass stars can live on main sequences with extremely long timescales where f ∞ is virtually time-independent.
The theoretical framework developed in Pasetto et al. (2012) for E found an application to the case of the Milky Way (MW) in Pasetto et al. (2016). We will not repeat it here, but we will focus on some aspects of the normalization with the intent of digging deeper into the constraints implied by such a formal approach. Figure 1: Foliation of the manifold E in leaves F s . Each leaf is parallel and disjointed from any other leaf and covering the spectrum of masses, the co-dimension p = 1 so that R p = R 1 ≡ M while the left 6N-1 dimensions, i.e., the number of particles and the metallicity, are arbitrary but fixed, i.e., constant. Hence from the figure we can see that the section for x 1 = M cnst. is not constant, but x 1+1 = x 2 = Z = cnst. and so forth up to the section x 1+6 N are all constant by construction.
We need to state clearly two definitions( 7 ): Definition 1: a simple stellar population (SSP) is a subset of E at constant Γ and Z.
This represents the fundamental unit from which to construct the theory of CSPs. A collection of stars born at a given time t, with a single metallicity Z, and with a range of mass M SSP ∈ [M min , M max ] represents a line in E parallel to the M-axis. A set of these lines at the same t for fixed Γ, but spanning a range in metallicity Z, represents a CSP. Remembering that dim Γ = 6N. it results natural to proceed with the following: Definition 2: a 1-dimensional class C ∞ foliation F of the 6N + 2 dimensional differentiable manifold E (called space of existence) is a decomposition of the E into a union of disjointed connected SSPs (otherwise referred to as leaves F s of F ), i.e., E = F s , with the following property: Every point E ∈ E has a neighborhood I ⊂ E and a system of local C ∞ coordinates x = M, Z, q 1 , ..., q 3N , v 1 , ..., v 3N : I → R 6N+2 such that for ∀F s the components of I ∩ F s are described by the equations (see Fig.1): From the definition of a SSP, the mass function of a CSP at a given arbitrary time t reads (see also Eq. (6) in 7 We will leave the time dependence explicit in our equations as far as possible to develop our consideration in parallel with the original general formalism presented in Pasetto et al. (2012). Moreover, although the Dirac notation is a winning one on the operators' algebra, we feel that the integral notation exploited in Pasetto et al. (2012) is more common in this astrophysical context and we will keep using it here. Pasetto et al., 2012): with δ (•) being the multidimensional Dirac's delta and s ∈F • the sum over all the SSPs. Integrals are supposed to extend over all the existence space unless stated otherwise. Eq.(5) is clearly equivalent: where Γ = Γ (t) and Z = Z (t) are the projections of the map x : R 6N+1 → R, i.e., the orbit of E in the section Γ and Z of E with x = {Γ, Z }. In Eq.(5) the time dependence is due to the distribution function f c , and the • s quantities for the SSPs are integrated over dZ dΓ. Eq.(6) transfers the time dependence to the phase variables being the two visions equivalents. The reader familiar with quantum mechanics can recognize in Eqs. (5) and (6) the equivalence between Schrödinger representation and Heisenberg representation of the temporal evolution of the state-variables. In Eq.(5) the time dependence of the state variable (in our case the mass) is due to the distribution function f c (Schrödinger representation), and the • s quantities for the SSPs are integrated over dZ dΓ. Eq.(6) transfers the time dependence to the phase variables (Heisenberg representation). While foliating the space E, the sum considered in Eq. (5) is weighted by f c that will add from F only the non-null contributions. In the same way in Eq.(6), and in the following Eq.(8), the temporal evolution of each function accounts accordingly for the sum over the leaves of F. We now can make use of the definition we made for the foliation of E. We know that Γ = Γ (t) is the solution of the Hamilton equations Γ = Γ (t), and Z = Z (t) is the chemical enrichment law of the population. Nevertheless, (4)) because we foliated the space on SSPs, and by definition every SSP is a set of stars at a given position in the space Γ and at a given metallicity in the space Z. Note that we need to have a formal expression for the equations of motion in Γ but we do not need them to come from an Hamiltonian vector field, e.g. by writing Γ = J ∂H ∂Γ (with a J standard symplectic matrix), nor we need any explicit form for the metallicity enrichment law Z = Z (t). Furthermore, while for every leaf Γ = cnst. and Z = cnst. (which is the way the F s are populated), in general it is M s (t) 0∀s because in the same SSP the mass evolves with time following the fuel consumption the-orem( 8 ) (Gunn et al., 1981). Hence, Eq.(6) reads: with f s referring to a DF for a SSP, while in the last line an initial mass function remains naturally defined aŝ ξ s ≡ N f s (M (t) ; 0) at each instant t. The shape of the mass function over each SSP F s , is invariant, i.e., the mass functionξ s for each SSP does not have an explicit temporal dependence, as expected from the concept of a SSP (Def. 1) which relies on a coeval set of stars. The rate of change of the mass with time ∀s is specific to the SSP considered at the time envisaged for the SSP, and indeed it depends on the fuel consumption theorem (Gunn et al., 1981) which allows us to write: For each SSP the contribution to the mass function is given by the product of a function that depends only on the mass, say an initial mass function (IMF) ξ c (M), times a function that for each mass depends only on the time, say the star formation rate (SFR) ψ c (t). We define them for the collective of the CSPs as: From the statistical point of view, Eq.(9) tells us that the probability to find a given star in the volume dMdt and a second star in the volume dM ′ dt ′ is just the product This assumption is not strictly necessary to derive the total mass of a CSP, as we shall immediately see here below, but it is a commonly accepted assumption that follows from the supposed independence of the star formation processes over the time t( 9 ). From Eq.(7) and Eq. (8) we obtain the total mass of a CSP (for a given interval of interest ∆t = T − t 0 > 0 referred to as the age of the CSP): 8 This is in general not true in the case of presence of close binaries (excluded for the purpose of this work) where the mass exchange through Roche lobes can play a major role in shaping the form of M = M(t) and we loose the enumerability of the stars. 9 Mathematically, the multiplicative separability is always possible if and only if (hereafter "iff") the functions are continuous and strictly positive (as in our case) because of the Kolmogorov Arnold "representation theorem".
with M l and M u being the lower and upper limits in the mass range considered in the CSP, and c • the sum over all the CSPs with at least a period of non-null SFR inside ∆t.
For example, let us suppose that we want to consider the different history of formation of the different CSPs in a completely isolated model of the MW (i.e., by considering CSPs of the galaxy as the Galactic bulge, disks, halo, spiral arms) in this unified formalism. While the total mass of the CSP M tot , is fixed because the system is assumed isolated, the contribution coming from the halo is built up much earlier in the history of the MW than the input to M tot given by spiral arm populations, which contain predominantly young stars. Hence for t 0 = 0 and T 13.8 Gyr (i.e., the age of the universe) the contribution in Eq.(10) from any CSP will be: where, for reasons that will be clear soon, we wrote the IMF for a CSP as the product of a constant ξ 0 and a func- , and the star formation rate ψ c as the product of a constant ψ 0,c times a functional form The different CSPs that we need to consider have non-null SFR only within specific temporal intervals, e.g., the SFR of spiral arms in the MW can be chosen as a non-vanishing function only in the last 0.5 Gyr while the SFR of the MW halo is a non-null function only in the past 12-13 Gyr. For this reason it is convenient to write and t ini,c and t end,c > t ini,c are the initial and final ages of star formation for the c th CSP considered. Here ϕ c is the functional form representing the SFR (e.g., an exponential, a linear profile, etc., see what follows) that we need to nullify outside a temporal interval of interest. We achieve this behavior with a "torii"-function (named after the traditional Japanese gates) i.e., a gate function that nullifies ϕ c (t) outside the limits t ini and t end . One possible solution to achieve this functional feature is with a composition of Heaviside θ = θ (z) functions as follows: moreover, this torii function will be useful again later in a different context. For example, for the spiral arm CSPs, it can be assumed t ini = 0.001 Gyr and t fin = 0.5 Gyr since there is no evidence in the MW for the spiral arms to be as old as 13 Gyr. To obtain the total mass of the CSP, in Eq.(11) we assumed the normalization constant to be unique. Conversely, how the stars of a CSP evolve can differ from population to population and the normalization constants ψ 0,c are left to vary from CSP to CSP, and we specify it with the individual index in Eq. (12), c = {spr, hal, thn,...} for "spiral", "halo", "thin disks", respectively: where ψ 0,c are constants to be determined as indicated in Sec.2.3.
We obtain now the same total mass of the CSPs by following a different path throughout the equations. We consider the propagator of f c as in Eq. (3): (16) This equation can be simplified further under the assumption of the collisionless behavior of the stellar populations under examination. In the case we are interested to galaxies, the long two-body relaxation time hypothesis allows us to pass from a 6N degree of freedom description to a 6-dimensional phase-space with the volume elements dγ = (d x, dv), by introducing the Boltzmann collisionless operator B to substitute the Liouville operator L. In this simplified picture Eq.(2) becomes ∂ t f c = (B + F ) [ f c ]. For the Hamiltonian systems we then recover classical stellar dynamics results (e.g., Bertin, 2014) for the density of the CSP as: whose integral over the configuration space will yield the total mass at the instant considered. To account for the temporal evolution of the density profiles is not a trivial task, and as it often happens in stellar dynamics we are interested in a match with observations of ρ tot (x; t) at the present time, i.e., ρ tot (x; T ) = ρ tot (x). Hence, at the present time we can write: where M c ≡ ∫ d x ρ c (x) for every CSP, c. This is the wellknown relation for collisionless galaxy dynamics defined in Γ, which is here obtained starting from the DF, f c , defined in E. It represents the same quantity found in Eq.(10) starting from the same f c in E but obtained by following a different path through the equations.
When are these mass determinations (from Eq.(10) and Eq. (18)) equivalent? What is the condition for the consistency of these two mass determinations? In a mathematical formulation we can recast the question as follows: when does the relation hold at a given time t? When does this equation have at least a solution? Is it unique and how to determine it?
We move our new goal to the research of a consistent mass determination so that the total mass M tot derived trough the standard stellar population theory, i.e. Eq.(10), coincides with the total mass arising from the density profiles, Eq.(18). The only parameters left to be determined are the constants ψ 0,c for multiple stellar populations. We show how to achieve this in the next section.

A fundamental mass consistency condition for collisionless multi-stellar populations synthesis
To answer the questions left in the previous section we proceed with the following definitions and by formulating the questions in the form of a theorem. We define as consistent a system of stars for which the following definition holds: Definition 3 [Consistent galaxy stellar population]: Given a collisionless CSP identified by the distribution function f c ∈ I ⊂ R + 0 in E ≡ M × Z × γ (I finite interval of the real positive line including the zero), for which the multiplicative separability of its mass function is given bŷ ξ c = ξ c (M) ψ c (t), it is said to be consistent if it obeys to the fundamental relation: Eq. (19) is a well-posed definition every time f c is nonnegative ( 10 ); a fact that always holds as a consequence of the definition of f c as a distribution function( 11 ).
To understand when a set of CSPs can be said to be consistent for the case of a collisionless stellar system is the goal of the following theorem.
Theorem [Collisionless multiple stellar populations consistency theorem (MSP-CT)]. Given a consistent composite stellar population (CSP) in the existence 10 This is a result sometime referred as Tonelli's theorem. 11 To make this definition explicit has also the aim to avoid nomenclature confusion with the concept of "dynamical consistency" used in stellar dynamical theory which means that given f tot = 6 space E = M × Z × γ defined by a DF f c ∈ I ⊂ R + 0 (I finite interval of the real positive line included the zero), we assume that multiplicative separability of CSP mass functionsξ c , i.e.,ξ c = ξ c (M) ψ c (t) holds, where for every CSP ξ c (M) = ξ 0 Ξ c (M) and ψ c = ψ 0,c Ψ c (t). The CSP is consistent iff the system of equations: has at least one solution. In this case, the IMF normalization constant is The SFR normalization constants are given by: where sums and products are assumed to run over all the CSPs.
Proof : The proof of Eq.(20) has been gradually achieved above with the passages from Eq.(11) through Eq.(18) once Defs. 1, 2, and 3 are considered. With the lemma in Appendix A, we conclude ✷.
Note that the collisionless nature of the CSP is an implicit hypothesis hidden in the definition of "consistency" of the CSP and it is necessary for the validity of the MSP-CT in the passage of Eq.(17).

Functional forms
To fully exploit the theorem (a couple of examples will follow, and see also Pasetto et al. (2018a)) we show a few self-standing results that are useful in handling the integrals in the previous theorem. These results represent the "tools" to build up analytically a consistent set of CSPs once the MSP-CT is used.

Star-formation-rate profiles
We will consider four star-formation-rate profiles: 1. Constant SFR. We assume a constant star formation between two instants t 2 > t 1 > 0: identically. Considering t G > t 2 > t 1 > t 0 > 0 and remembering that for the Heaviside theta function it holds the relation ∫ θ (z) dz = zθ (z) + cnst., the integrals in Eq.(22) yield: 2. Exponential SFR. We consider a profile with h τ ∈ R\ {0} non-null time scale length of an exponentially in/decreasing profile. We find for the integrals in Eq.(22) (with t G > t 2 > t 1 > t 0 > 0): 3. Linear SFR. We investigate a linear pattern for the SFR between two assigned times, i.e., a shape Eq.(22) (with t 2 t 1 , ψ t 2 , ψ t 1 all positive numbers) is then integrated entirely analytically (under the assumption of the previous case 1. and 2.) as: It is clear that this kind of profiles once considered together with Eq. (12) can be combined to achieve any global SFR desired (see also Eq.(12)), where the age and metallicity relation are the result of a piecewise function. 4. Rosin-Rammler SFR. Finally, it is of interest to present a SFR of the form (Rosin-Rammler, 1933): under the condition that t 2 > t 1 > 0, 1 β > 0 is constant, and h τ > 1 (see, e.g., Chiosi 1980, Grieco et al. 2012 for an extensive investigation of this family of profiles in relation to the MW chemical modeling or Matteucci (2012) for a review on the theory of chemical evolution of stellar populations). The integrals needed in Eq.(22) read: where with γ (a, z) = ∫ ∞ z dte t t a−1 we indicated the incomplete gamma function.

Mass function profiles
We will consider three initial mass function profiles within preassigned mass limits M ∈ [M l , M u ]: 1. Single power law IMF (e.g., Salpeter, 1955). The integrals involved in Eq.(20), with and α = cnst. yield: 2. Piecewise linear functions are very popular in the literature (e.g., Kroupa, 2001;Scalo, 1986). Hence it is worth to consider in detail what the normalization process required by the consistency theorems implies for these profiles. We require a single normalization factor for all the piecewise linear functions (ξ 0 is unique as the total mass) so that continuity of the piecewise linear functions for different mass intervals requires a different scale function ξ α i 0 for each mass interval, say M ∈ [M i , M i+1 [: where the function τ is the same previously introduced in Eq.(13) and N sl is the number of slopes in the polygonal IMF considered. We need to determine the coefficients, ξ 0,α i , which grant continuity of the IMF in the points of connection of two consecutive slopes. Hence we solve the recurrence equation for the unknown generic coefficient ξ where we imposed an arbitrary condition for the global normalization in the first coefficient ξ 0,α 1 = ξ 0 to the recurrence equation. The solution of the previous equation with this condition reads: Hence, in this way, we generalized the polygonal function with( 12 ) The most interesting case is for the number of slopes N sl = 3, i.e., where i = {M l , M 1 , M 2 , M u } are the lower mass, the first and second separation masses of the profile slopes, and the upper maximum mass considered respectively. In this case, for the integrals involved in Eq.(20) we get: 12 Note that fixed values of mass are referred to as M while the variable mass is referred to by the italic symbol M throughout the paper.
where in the last line we made use of the Eq.(32). 3. Lognormal IMFs define a commonly used parametric family of profiles for stellar systems often used in combination with power-laws. We define them as: in conjunction with power-laws as in Chabrier (2003) or Miller and Scalo (1979). They can be equally fully integrated just noticing that the indefinite integrals with C a and σ M as normalization constants, and erf(•) is the Error function.
These four profiles of the SFR and three of the IMF represent all the tools necessary to work with the previous theorem. With these fully analytical integrals at our hands, we can solve two numerical examples to show how the previous theorem acts. A sophisticated multi-stellar population model based on the MSP-CT is presented in Pasetto et al. (2016) and Pasetto et al. (2018b), and available on-line at www.galmod.org.

A simple model of the Milky Way potential
We numerically test the validity of the Eqs.(20) involved in the MSP-CT. For this exercise, we choose to build up a simple MW potential (Table 1). This Table presents a perfectly functional MW model matching the major observational constraints on the MW potential. With Table 1 and the density profiles in Pasetto et al. (2016), we obtain a total mass for the MW within 100 kpc of M 100 0.8 × 10 12 M ⊙ , the rotation curve at the solar location, v c (R ⊙ ) = 228 km s −1 , the fraction of disk mass over the spiral component mass,  This benchmark model can be tested by setting these values in the on-line galaxy model web page 13 It is beyond the goal of this paper to review the equations and the observational constraints considered in the MW potential. Nevertheless, in Pasetto et al. (2016) we presented the equations adopted to compute these values as well as a review of the most relevant observational constraints on these values.  Pasetto et al. (2016). Here we just mention that M B , h r, B are total bulge mass and radial scale length, ρ D , h R , h z , Φ a 0 , h a spr , m, Ω p , p, h S are central density, scale length, scale height, perturbation amplitude, spiral arm or bar scale length, total number of spiral arms, angular pattern speed, pitch angle, and shape function scale length, respectively, for all the disks exponential profiles and ISM. ρ 0, H * , h r H * , α are the stellar halo central density, scale length, and density slope, respectively, and v 0 , h r, D M , q are the scale velocity, scale length and flattening factor of the dark matter profile. Finally, σ R R ⊙ is the only velocity dispersion tensor component necessary for the CSP considered along the principal axis of the system of reference of the population.

Components
Scale parameters ∆t (www.GalMod.org) of "GalMod" (Pasetto et al., 2016(Pasetto et al., , 2018b and setting to zero the density profiles of the thin disk population n o 3, 4 and 5. These parameters are obtained by minimizing a distance function in the parameter space from the best-values presented in Pasetto et al. (2016). The kinematic parameters presented in Table 1 and not involved in this exercise are left for completeness and were obtained by averaging the parameters in Table 2 in Pasetto et al. (2016) (i.e., they are not obtained from a direct fit of the data as for Table  2 in Pasetto et al., 2016).
Taking the two CSP, e.g., spiral arm and thin disk, we can assume a constant star formation rate for the spiral arm over the first t ∈ [0.1, 0.9] Gyr. We exclude the first 100 Myr where we cannot properly speak of the "stellar population" because the stars are assumed to be still embedded in a collisional/dissipative environment, i.e., inside their parent molecular cloud or OB association locus. Note that the star formation rate is not inserted in M ⊙ yr −1 because the role of the MSP-CT is to ensure the correct matching between the amount of mass that results from the density profile parameters adopted (Col. 2 in Table 1). For the star formation rate over the past 10 Gyr, t ∈ [0.9, 10.0] Gyr, we want to provide a more articulate profile for the SFR considering the significant temporal extension. We opt for Eq.(29) where we choose {β, h τ } = {2.0, 1.1 kpc} (e.g., Just and Jahreiß, 2010;Just et al., 2011). Because spiral arms came just from a perturbed distribution of a thin disk unperturbed mass distribution, both for the thin disk component and the spiral-arm component the total mass will be given by Eq.(45) in Pasetto et al. (2016) for R max → +∞: where just for the purpose of this example, Ξ spr , Ξ thn , Ψ spr , and Ψ thn are chosen to be Eqs.(32), (37), (24) and (30) respectively, with IMF parameters from Salpeter (1955) and Kroupa (2001). Once we have obtained the normalization coefficient we can compute the number of stars that would fulfill the mass distribution ρ for an IMF populated with masses in the interval M ∈ [M l , M u ]. For the case considered above, we quickly obtain from Eq.(21) with Eq.(41) that so that the total number of stars for these two CSPs is N spr = 1.240 × 10 10 N thn = 3.618 × 10 10 . This result evidences the primary goal of the theorem: it adjusts the normalization functions so that the total amount of mass in stars realized by the star formation processes (with the assumed IMF) matches (at the instant considered) the total mass of the density profiles that generate the potential. Eq.(22) seems to be the only available option to compute algebraically the consistency condition for multiple stellar populations: even an algebraic software manipulator as Mathematica (Ver. 11.1.1) seems not to be able to produce algebraic solutions for N greater than two, giving output for ψ 0 that is at least a few pages long and virtually impossible to check and implement. Vice versa the explicit formulation presented in Eq.(22) allows us to easily handle many CSP's ψ 0 in a fully algebraic manner. Moreover, it allows the determination of the number of stars at the instant considered in concordance with the density profiles (and hence potential), IMF, and SFR.

Numerical solution of the star-count equation along any FoV
To be able to determine the number of stars in a field of view dΩ along any line of sight, it is of paramount importance to investigate the distribution of mass that generates it and, in turn, the underlying global gravitational potential. In this example, we show how to use the previous theorem to obtain this significant quantity. For the sake of this exercise, we will omit the observational colormagnitude diagrams investigation which assume it possible to observe all the stars within a mass range in theΩ of interest. Differently from the previous exercise, we exploit here the configuration space dependence of the MSP-CT i.e., the left-hand side of Eq.(20). We base this example again on the potential of Table 1, and we solve Eq.(20) along an arbitrary but fixed direction.
To test the central bulge/bar model decomposition (whose details are introduced in Pasetto et al., 2016Pasetto et al., , 2018b, we choose two small fields of view (FoV) in the di- We compute the same equations of the previous exercise. For the six stellar populations we evaluate six integrals I ψ,i for i = 1, .., 6 from Eq.(29), (29), (23), (29), (23), (22) for the bulge, bar, thin disk 1, thin disk 2, thick disk, and halo, respectively (with parameters as in Table 1). IMF profiles are taken from Eq.(35) and the integrals I Ξ,i for i = 1, .., 6 computed accordingly. The integrals of Eq. (18) were computed in cones of increasing size throughout the FoV directions. In Figure 1 we plot the relative number of stars N (R) normalized to the central galaxy value N (0). The two FoVs start with the same number of stars in each FoV and the N (R) trend is dominated by the "cone effect" of the opening angle. Nonetheless, already at about an heliocentric distance r hel 4 kpc the difference in the stellar density profiles due to the different direction starts to be visible. The non-axisymmetric effects are mostly visible around r hel = 7 kpc where the l.o.s. along the positive longitude meets the bar overdensity and grows while the negative longitude does not show the bar effect (further details in a dedicated paper, Pasetto et al. (2018b)).

Discussion
Every time a system presents irreversibility (i.e., dissipative processes, gas driven processes, friction, mergers, etc.) then non-Hamiltonian statistics has to be used to describe its dynamics. The galaxies do not represent an exception. Their stellar component origins from gravitationally bound clouds of gas that evolve converting (in an irreversible way) the hot gas to molecular gas, then to stars and again to chemically processed gas ejected into the ISM in an irreversible cycle of gradually increasing entropy (in an isolated system). We described this continuous dynamic in a space E from the point of view of stellar populations, i.e. a set of discrete elements (stars) that are allowed to be created, to evolve and to die while moving in the space. To realize a sounder mathematical setting for the concept of the stellar population, we made use of foliations in the existence space of the stellar populations in SSPs.
This framework has the advantage of dealing with integral quantities instead of the discrete set theory (i.e., with distribution functions on well-behaved manifolds). In Pasetto et al. (2016) it has proven its enormous advantage by solving the classical star-count equation outside the "small FoV" framework. This resulted in a star count solution for a large FoV that is a particularly promising result when considering the ever increasing datasets stemming from current and upcoming partial or whole-sky surveys.
The second and more significant advantage inherited from the concept of the distribution function is the full incorporation of the dynamics in the stellar population treatment. In the existence space E, the distribution function is treated with a mathematical formalism borrowed from quantum mechanics and the fundamental units are represented by a time-invariant element, with which the existence space E can be foliated (i.e., SSPs). This offers an elegant theoretical formalism and a rich mathematical background from quantum mechanics to exploit. Finally, the extension of the concept of stellar populations to combine the classical stellar population theory and the stellar dynamics theory as presented here aims to give a solid mathematical basis for the classical research on stellar systems as described in E (e.g., pioneered by Bienayme et al., 1987;Mendez and van Altena, 1996;Robin and Creze, 1986).
We note how our concept of foliation introduced in Sec.2.2 can be naturally pushed further to describe the phase-space of collisionless systems, say γ, as a special case. If we consider a collisionless stellar system such as a galaxy, e.g., our MW, and we assume it to evolve in complete isolation (i.e., we exclude tidal interactions with the dwarf companions, complete phase-mixing in γ, streams, star cluster inside the galaxy, binary interactions, etc.) we can try to exploit the Jeans theorem to foliate γ by (isolating) integral of motions (e.g., Lynden-Bell, 1962). In this case, we are able to write the DF for each SSP as f SSP = f SSP M, Z, I 1,..,n for I i integrals of motions in γ. Unfortunately the limitations in the applicability of this approach can be severe. The observations of the MW in particular show the existence of a bar in the MW center and spiral arms (i.e., non-inertial CSPs that slow down kinematic heating), accretion events (e.g. from dwarf galaxies) that apply torque to the MW angular momentum, etc. All these events induce a violation of the conservative (i.e. Hamiltonian) nature of the systems. Probably the most prominent example of a non-Hamiltonian, timeirreversible system is our own Galaxy. Modern research to overpass the limitations of the Hamiltonian (or actionbased) formalism is lead by N-body numerical simulations (e.g., Genel et al., 2014;Kawata and Gibson, 2003) or analytical studies (e.g., Cubarsi, 2010).
Finally, we stress how MSP-CT not only offers consistency in the existence space of the CSP, but it also provides the number of stars in an entirely general setting (without symmetry conditions on the underlying stellar populations). It sets the basis for the star count technique (Pasetto et al., 2016(Pasetto et al., , 2018b. The theorem does not claim the uniqueness of the solution. The theorem is indeed the outproduct of an average procedure on all the possible microstates of the CSP, i.e., on an ensamble as introduced in Sec.2. To unequivocally specify the microstate is beyond the framework of the theory and it would require the specification of the evolution operator E in detail. In Eq.(2) we should give explicit formulation to the Liouville operator ιL [•] thus relating f c with the total gravitational potential through the Poisson equation ( 14 ), to the compression operator ιΛ [•] (to account for the presence of gas influencing the dynamics of the system or the presence of 14 See footnote 10 external systems) and finally to the ιF [•] thus accounting for the change in the number of stars in agreement with the equation of stellar structure. In particular, in the extended space E ′ for a given input physics (equation of state, nuclear reactions, opacity, etc.), the mass and chemical composition of a star, the structure and hence the position on the Hertzsprung-Russell diagram (HRD) should be uniquely determined (Kippenhahn et al., 2012). Unfortunately, the Vogt-Russell Theorem has never been proven on strict mathematical basis, and sometimes the presence of loops in intermediate mass stars at given input physics seems to have an erratic behavior: two stellar models with the same internal structure seem to correspond to two different locations on the HRD (one red and the other blue) thus resulting in a violation of the Vogt-Russel Theorem (Lauterborn, 1973(Lauterborn, , 1972. In relation to this, it is worth recalling that all stellar models are calculated with numerical methods so that the claim that two stellar models are identical is always hampered by this inherent drawback. In any case, several decades of systematic applications of stellar evolution theory and their results (isochrones, synthetic HRDs, etc.) to study stellar populations in clusters and fields, have always provided a consistent interpretation of the observational data. To conclude, we are inclined to consider the Vogt-Russell Theorem always verified and the stellar models in use "unique" even though further investigation is required.
This theorem is neither present, nor are there any similar consistency conditions implemented in any of the available star count models proposed in the literature, such as the Besançon model (Robin et al., 2003), Trilegal (Girardi et al., 2005). Every time the gravitational potential is involved in the generation of the stellar kinematics, all the stellar populations must be considered simultaneously (because their combined gravitational potential enters in the collisionless Boltzmann equation). In this way, the fundamental theorem of mass consistency must be applied to generate the number of stars even when the model works in the small field of view approximation (i.e., whenever gradients of the density distribution are not relevant, see Pasetto et al. 2016).

Conclusions
In this paper, we investigated the mass-consistency relation between the concept of stellar populations as intended by the classical stellar dynamical theory and the classical stellar population theory. This is done in the framework of a generalized concept of the stellar population developed by Pasetto et al. (2012), where phase-space, mass, and metallicity of the stars are treated using distribution functions defined in a suitable existence space. We obtain a condition (expressed in the form of a theorem) that has immediate applications to a star-count model technique.
Traditionally, a population (here a stellar population) is considered as a set of elements (stars) sharing common properties( 15 ). Because of numerous stars typically involved, it is often more convenient to speak about distribution functions in a suitably defined manifold of existence for the stellar population. This framework is commonly adopted in stellar dynamics and was exploited for the first time in stellar populations by Pasetto et al. (2012).
When does the total stellar mass of the galaxy obtained by stellar dynamics theory equal the total stellar mass obtained by stellar population theory? In this work, we found a new answer in the form of the MSP-CT to this old question( 16 ).
We started exploring what we can learn by endorsing the Pasetto et al. (2012) formalism. In particular, for the case of a galaxy (i.e., an approximatively collisionless-dynamics stellar system), we obtained a theorem (MSP-CT) that proves how the solution of the classical condition of equivalence between stellar dynamical mass M dyn , and stellar population mass M str , say M dyn = M str , always exists (it is not unique, as seen in the Lemma appendix A) and it is given by the set of Eqs.(20), (21), and (22), obtained for the first time here.
Aside from the mathematical formalism, the physical interpretation of the MSP-CT can be understood as follows. The relative contribution to the total mass of two or more stellar populations depends at every instant on their relative density distribution according to their star formation history. The way in which the total mass is distributed among the stars (or between the different CSPs) depends on the SFH and the IMF of these populations. As time elapses, the stellar population ages and the stars leave the main sequence, die, or enter a quiescent stage (white dwarfs, neutron stars). During their life, they recycle material and enrich the interstellar gas violently (e.g., as supernovae) or quietly (as stellar winds). The metal abundances increase due to both self-enrichment by the parent SSP and the contribution from all other stellar SSPs. The way in which these stars are distributed at every time t is determining their number and their overall mass in any arbitrary volume of the galaxy, a result that is given (in an analytical way) by the mass-consistency theorem for composite multiple-stellar populations (MSP-CT).
The applications of the MSP-CT are left to a dedicated paper that introduces GalMod and its features (Pasetto et al., 2018b).
Furthermore, our formalism is not limited to galaxies. We worked out a framework that extends the applicability of the concept of multiple composite stellar populations to a large variety of gravitationally bound stellar systems. The concepts we have developed in this paper can be applied straightforwardly to studies of globular clusters with single or multiple stellar populations (e.g., Milone et al., 2012;Norris, 2004), to the Milky Way observed along any line of sight (e.g., Ng et al., 1995;Vallenari et al., 2006), to the dwarf galaxies of the Local Group (e.g., Grebel, 1997;Mateo, 1998;Tolstoy et al., 2009), and more distant galaxies as long as their stars can be resolved (e.g., Crnojević et al., 2016). The formalism applies to systems dynamically governed by both collisional and collisionless dynamics.
Finally, a few further notable results of this paper are the following: • We use the mathematical concept of foliation to formalize the multiple-stellar population theory. So far, this is the only known way to reconcile both the time evolution of a distribution function used in classical dynamics theory, and the concept of SSPs typically used in the classical stellar population theory. In this way, we were able to retain the concept of the "CSP as the sum of SSPs," typical for a classical stellar population theory, as well as the time evolution of the distribution functions typically describing classical stellar dynamics.
• We introduced the most general Liouville theorem (see Eq. (1)) known so far for the conservation of the flux in a given space, and we considered its validity in the space E defining our concept of CSP. It contains a geometric factor (the metric tensor) that must be accounted for by the general manifold treatment of the existence-space introduced by the compositestellar population theory. The theory aims to set the basis for a unification of the classical stellar population theory and classical dynamics theory. This equation has no precedent astrophysical use and, therefore, should be of interest to anybody who wants to relate the dynamical "self-consistency" with the stellar population "consistency" to investigate the distribution function of a time-dependent stellar population.
• By introducing a "torii-function" (see Eq. (14)) we present the first fully analytical formulation of a polygonal function for an arbitrary number of segments (see Eq. (36)). The fact that the IMF requires three sections is just a particular case of Eq. (36), but it provides a general interpolation function between n arbitrary points.