Final Size for Epidemic Models with Asymptomatic Transmission

The final infection size is defined as the total number of individuals that become infected throughout an epidemic. Despite its importance for predicting the fraction of the population that will end up infected, it does not capture which part of the infected population will present symptoms. This information is relevant because it is related to the severity of the epidemic. The objective of this work is to give a formula for the total number of symptomatic cases throughout an epidemic. Specifically, we focus on different types of structured SIR epidemic models (in which infected individuals can possibly become symptomatic before recovering), and we compute the accumulated number of symptomatic cases when time goes to infinity using a probabilistic approach. The methodology behind the strategy we follow is relatively independent of the details of the model.

The severity of an epidemic is assessed using different indicators. Among these indicators, there are two that are particularly important. One of them is the final size of the epidemic (for those infectious diseases that are not endemic), which gives the total number of individuals that will become infected during an epidemic. The other is the basic reproduction number (Diekmann et al. 1990), which gives the expected number of secondary infections produced by a typical infected individual at the beginning of the epidemic (when basically the entire population is susceptible and the infected population grows exponentially). It is clear that both indicators depend on the particularities of the infectious agent and the host population. Prevention and/or control measures seek to modify these particularities so that the epidemic is as mild as possible.
In certain epidemiological models, the two previous indicators are related in the sense that one can be deduced from the other. For instance, in the SIR model given by the ODE system

$$S' = -\beta \frac{S I}{N}, \qquad I' = \beta \frac{S I}{N} - \gamma I, \qquad R' = \gamma I, \tag{1}$$

the basic reproduction number is $R_0 = \beta/\gamma$ and the fraction of infected population at the end of the epidemic, denoted by $\pi$, is the unique positive solution (when $R_0 > 1$) of the equation

$$\pi = 1 - e^{-R_0 \pi}. \tag{2}$$

This type of relationship has been found in other, more elaborate models than system (1) (see, for instance, Ma and Earn 2006; Arino et al. 2007; Diekmann et al. 2013; Inaba 2014; Magal et al. 2016, 2018; Almeida et al. 2021). As suggested in Diekmann and Heesterbeek (2000) and explained in more detail in Miller (2012), for a relation of the form (2) to exist, the infection probability from one individual to another must be independent of the moment at which the first individual becomes infected. The term "probability" in the above condition may suggest that this condition is valid only for stochastic epidemiological models. However, it can also be applied to deterministic models like system (1), simply by considering one of its possible underlying stochastic models (i.e. a stochastic process whose expected dynamics is described by the deterministic system (1) when the total population is large enough).
When a significant portion of the infected population is asymptomatic, the measurable impact of the epidemic is more closely related to the number of individuals who will develop symptoms than to the final size of the infection. In these situations, therefore, it can be useful to study an indicator that gives the expected number of symptomatic cases that will occur throughout the epidemic, which we will call the final symptomatic size. Different models that distinguish between asymptomatic and symptomatic infected individuals have been analysed, for instance, in Inaba and Nishiura (2008); Cushing and Diekmann (2016); Leung et al. (2018); Liu and Webb (2021); Barril et al. (2021b); Fitzgibbon et al. (2020). The aim of this article is to give formulas analogous to relationship (2) in which the fraction of symptomatic population at the end of the epidemic (instead of the total fraction of infected) intervenes.
In Sect. 2 of this paper, the relationships between the basic reproduction number and the final symptomatic size are derived. In order to do this, we calculate the probability that a susceptible individual chosen at random at the beginning of the epidemic, what is called a test individual, ends up getting infected and showing symptoms. In order to introduce the ideas progressively, we distinguish three scenarios of increasing complexity: homogeneous populations (in which all individuals behave in the same way), populations where there is heterogeneity only among the susceptible, and populations where heterogeneity is present in both the susceptible and the infected individuals. Each of these scenarios is presented in its own subsection. In Sects. 3, 4 and 5, the results of Sect. 2 are applied to three examples covering the three possible scenarios. Specifically, in Sect. 3 we give an example of a homogeneous population in which the infected individuals are structured by their age of infection. We note that, despite the fact that the infected population is structured, the population is homogeneous since even though the behaviour of each infected individual varies throughout its life (it will be more or less infectious depending on the age of the infection), the way in which this behaviour varies is common to all infected individuals. Section 4 is devoted to an example of a heterogeneous population in which there are a finite number of different classes (typologies), both infected and susceptible. Section 5 generalizes this example by considering the susceptible and infected classes structured by a continuous variable.

Does a Test Individual Present Symptoms?
As shown in Miller (2012) (and previously suggested in Diekmann and Heesterbeek 2000), the proportion of infected individuals in an outbreak can be computed as the probability that an individual chosen at random is infected at some instant during the epidemic. Under certain conditions on the population structure, this probability can be expressed in terms of $R_0$, that is, the expected number of new infections produced by an infected individual in a fully susceptible population. A systematic procedure to compute $R_0$ is based on the so-called next-generation operator, denoted by $G$, which gives the distribution of secondary infections as a function of the distribution of the primary infected individuals. It can be shown that $R_0$ coincides with the spectral radius of $G$ (see Diekmann et al. 1990; Inaba 2017; Barril et al. 2018).
In order to give a formula not for the size of the infected compartment, but for the proportion of individuals that have presented symptoms, it is enough to multiply the probability that a test individual is infected by the probability that this individual manifests the disease. In the following, we will discuss some general scenarios in which these two probabilities can be computed explicitly.

Homogeneous Population
Let us recall from Miller (2012) the derivation of the final infection size when all the individuals in the population are equivalent and the susceptibility level of an individual does not change during the epidemic (here by susceptibility level we mean the probability that a susceptible becomes infected per unit of time and per infected individual in the population). Notice that under this hypothesis, all susceptible individuals are equally likely to be infected by a random infected individual. That is, if we take a test individual from the susceptible population, then all infected individuals will infect this test individual with the same probability. On the contrary, if the susceptibility level changes, then not all infected individuals will infect the test individual with the same probability. For example, if people change their social habits due to a high prevalence of the disease, then it will be more probable that the test individual gets the infection from an individual infected at the beginning of the epidemic (when people do not adopt prevention measures) than from an individual infected later, when the prevalence is high (and people adopt prevention measures). A simple ODE system in which the susceptibility level changes is:

$$S' = -\beta(I) \frac{S I}{N}, \qquad I' = \beta(I) \frac{S I}{N} - \gamma I, \qquad R' = \gamma I,$$

where $\beta$ is a decreasing function and $N$ denotes the total population. As far as we know, it is not possible to derive the final infection size of this kind of system (when the susceptibility level changes in time) without integrating the trajectories of the system. It is possible, however, to derive bounds on the final infection size taking into account the maximum and minimum values that $\beta$ can take (Arino et al. 2007), and to derive analytical expressions for the final infection size in models in which the susceptibility level changes permanently from one value to another after the spread of the epidemic reaches some threshold (Gog and Hollingsworth 2021).
This is why in this work we restrict ourselves to the case in which the susceptibility level of individuals is constant in time.
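To illustrate what "integrating the trajectories" means in practice, the following minimal sketch integrates an SIR system with a prevalence-dependent transmission rate using a classical Runge-Kutta scheme and returns the accumulated infected fraction. The function name `sir_final_size`, the normalization $N = 1$, the particular decreasing function $\beta(I) = 2/(1 + 10 I)$ and all numerical values are assumptions made for this example only; they are not taken from the models analysed in this work.

```python
def sir_final_size(beta, gamma, i0=1e-6, dt=0.02, t_max=300.0):
    """Integrate S' = -beta(I) S I, I' = beta(I) S I - gamma I (fractions, N = 1)
    with the classical RK4 scheme and return 1 - S(t_max), i.e. the accumulated
    infected fraction (initial infected included)."""
    def rhs(s, i):
        force = beta(i) * s * i          # new infections per unit of time
        return -force, force - gamma * i

    s, i = 1.0 - i0, i0
    for _ in range(int(t_max / dt)):
        k1s, k1i = rhs(s, i)
        k2s, k2i = rhs(s + 0.5 * dt * k1s, i + 0.5 * dt * k1i)
        k3s, k3i = rhs(s + 0.5 * dt * k2s, i + 0.5 * dt * k2i)
        k4s, k4i = rhs(s + dt * k3s, i + dt * k3i)
        s += dt * (k1s + 2.0 * k2s + 2.0 * k3s + k4s) / 6.0
        i += dt * (k1i + 2.0 * k2i + 2.0 * k3i + k4i) / 6.0
    return 1.0 - s


# Hypothetical example: constant rate versus a rate that decreases with prevalence.
size_constant = sir_final_size(lambda i: 2.0, gamma=1.0)
size_adaptive = sir_final_size(lambda i: 2.0 / (1.0 + 10.0 * i), gamma=1.0)
```

In this example the constant-rate run should agree with the classical final size relation for $R_0 = 2$ up to the small initial infected fraction, while the adaptive run can only be obtained numerically and stays below the constant-rate value, in line with the bounds mentioned above.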
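Let $K(N)$ denote the number of individuals that will become infected during an epidemic in a population of $N$ individuals. In general, $K(N)$ is a random variable. Let us assume that

$$\frac{K(N)}{N} \longrightarrow \pi \quad \text{in probability as } N \to \infty. \tag{3}$$

The constant $\pi$, referred to as the final infection size from now on, represents the proportion of accumulated infected individuals at the end of the epidemic when the initial susceptible population is sufficiently large. The previous assumption means that, for large population sizes, the random variable $K(N)/N$ can be approximated by $\pi$. More precisely, for all $\varepsilon > 0$,

$$\lim_{N \to \infty} P\left( \left| \frac{K(N)}{N} - \pi \right| > \varepsilon \right) = 0.$$

Under this hypothesis, it can be shown (as an application of the Poisson limit theorem) that, for all $\lambda > 0$,

$$\mathrm{Bin}\left(K(N), \frac{\lambda}{N}\right) \longrightarrow \mathrm{Poiss}(\lambda \pi) \quad \text{in distribution as } N \to \infty, \tag{4}$$

where $\mathrm{Bin}(n, p)$ denotes the binomial distribution with parameters $n \in \mathbb{N}$ and $p \in [0, 1]$ and $\mathrm{Poiss}(\lambda)$ denotes the Poisson distribution with parameter $\lambda > 0$.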
Since we are supposing that all infected individuals can infect a test individual $u$ with the same probability, it follows that the probability that such a test individual becomes infected is

$$P(u \text{ infected}) = 1 - P\left(\mathrm{Bin}\left(K(N), \frac{R_0}{N}\right) = 0\right),$$

since each infected individual (from the total $K(N)$) can infect the test individual $u$ with probability $R_0/N$, where, as said before, $R_0$ is the expected number of secondary infections that a primary infected individual produces when the population is fully susceptible. Then, in the limit $N \to \infty$, we have $P(u \text{ infected}) = \pi$ and, on the other hand, due to assumption (3) via (4), $P(u \text{ infected}) = 1 - P(\mathrm{Poiss}(R_0 \pi) = 0) = 1 - e^{-R_0 \pi}$, so that

$$\pi = 1 - e^{-R_0 \pi}. \tag{5}$$

The previous formula for the final infection size can be used to derive the number of symptomatic cases that the epidemic will cause. Indeed, if $p_{sym}$ denotes the probability that an infected individual presents symptoms, then the probability that the test individual $u$ presents symptoms is

$$\pi_{sym} := P(u \text{ presents symptoms}) = p_{sym}\, \pi = p_{sym} \left(1 - e^{-R_0 \pi}\right). \tag{6}$$

The probability $\pi_{sym}$ coincides with the proportion of symptomatic individuals at the end of the epidemic. It can be shown that, if $R_0 > 1$, Eq. (5) has a unique positive solution, which we denote by $\pi(R_0)$. Moreover, $\pi(R_0)$ is an increasing function of $R_0$, which means that the larger $R_0$ is, the larger the final infection size. As $\pi_{sym}$ is an increasing function of $\pi$, we also conclude that $\pi_{sym}$ is an increasing function of both $R_0$ and $p_{sym}$.
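As a numerical illustration of the homogeneous case, the sketch below solves the final size relation $\pi = 1 - e^{-R_0 \pi}$ by fixed-point iteration and then multiplies by the probability of presenting symptoms. The function names and the values $R_0 = 2$, $p_{sym} = 0.4$ are hypothetical choices for the example, and the iteration itself is just one convenient way (among others, such as bisection) to find the positive root.

```python
import math


def final_size(r0, tol=1e-12, max_iter=10_000):
    """Positive solution of pi = 1 - exp(-r0 * pi); returns 0.0 when r0 <= 1."""
    if r0 <= 1.0:
        return 0.0
    pi = 1.0  # start above the positive root; the iterates decrease towards it
    for _ in range(max_iter):
        new_pi = 1.0 - math.exp(-r0 * pi)
        if abs(new_pi - pi) < tol:
            return new_pi
        pi = new_pi
    return pi


def final_symptomatic_size(r0, p_sym):
    """Final symptomatic size: probability of symptoms times the final infection size."""
    return p_sym * final_size(r0)


# Hypothetical values, for illustration only.
pi = final_size(2.0)
pi_sym = final_symptomatic_size(2.0, 0.4)
```

Starting the iteration at 1 is a deliberate choice: the map $\pi \mapsto 1 - e^{-R_0 \pi}$ is increasing and concave, so the iterates decrease monotonically to the positive root instead of drifting to the trivial solution 0.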

Heterogeneous Susceptible Population
In more realistic situations, the susceptibility is not the same for all individuals because there are physiological or behavioural differences between them. For instance, a probability of infection higher than average could be observed in immunosuppressed people (a physiological trait) or in promiscuous people (a behavioural trait). Epidemiological models capture this heterogeneity by structuring the population into compartments or classes, describing the population by a density function with respect to the structuring variable (Almeida et al. 2021;Inaba 2014;Lorenzi et al. 2021;Peng and Zhao 2012;Wang and Zhao 2012).
When the susceptibility level of individuals does not change during the epidemic and all infected individuals are equivalent, a generalization of (5) can be derived (Miller 2012). Structuring the susceptibility classes according to a variable $x$ taking values in a set $X$, we can define $R_0(x) > 0$ as the number of secondary infections an infected individual causes to susceptibles of type $x$. In particular, this definition implies that

$$R_0 = \int_X R_0(x)\, dx.$$

Let $s_0(x)$ be the normalized density of susceptibles within class $x$ at the beginning of the epidemic (i.e. $s_0(x) N$ is the density of susceptibles of type $x$ at that moment). Let $K(N)$ be, as in Sect. 2.1, the number of individuals that will become infected during the epidemic in a population of $N$ individuals. Then, using again (4), the probability that a test individual of type $x$ becomes infected (for large $N$) can be expressed in terms of $R_0(x)$ as:

$$P(u \text{ infected} \mid u \text{ is of type } x) = 1 - e^{-R_0(x)\, \pi / s_0(x)},$$

since each infected individual (from the total $K(N)$) can infect the test individual of type $x$ with probability $\frac{R_0(x)}{s_0(x) N}$. Now we can define $\pi_s(x)$ so that $N \pi_s(x)$ gives the density of susceptible individuals of type $x$ that will become infected at some point during the epidemic. Then, one has (interpreting $P(u \text{ is of type } x)$ as the probability density of $u$ being of type $x$)

$$\pi_s(x) = P(u \text{ infected} \mid u \text{ is of type } x)\, P(u \text{ is of type } x) = s_0(x) \left(1 - e^{-R_0(x)\, \pi / s_0(x)}\right), \tag{7}$$

and in particular

$$\pi = \int_X \pi_s(x)\, dx = 1 - \int_X s_0(x)\, e^{-R_0(x)\, \pi / s_0(x)}\, dx, \tag{8}$$

where we have used that $\int_X s_0(x)\, dx = 1$. This formula gives the final infection size of the epidemic. In order to know the final number of symptomatic cases, we have to introduce $p_{sym}(x)$ as the probability that a susceptible individual of type $x$ presents symptoms after becoming infected, and then, the probability that a test individual of type $x$ shows symptoms is

$$P(u \text{ presents symptoms} \mid u \text{ is of type } x) = p_{sym}(x) \left(1 - e^{-R_0(x)\, \pi / s_0(x)}\right).$$

Since, from (7), the density of susceptibles of type $x$ that will develop symptoms is

$$\pi^{sym}_s(x) = p_{sym}(x)\, \pi_s(x), \tag{9}$$

we obtain that the fraction of symptomatic cases that will be produced during the epidemic is:

$$\pi_{sym} = \int_X \pi^{sym}_s(x)\, dx = \int_X p_{sym}(x)\, s_0(x) \left(1 - e^{-R_0(x)\, \pi / s_0(x)}\right) dx. \tag{10}$$

Therefore, if $s_0(x)$, $R_0(x)$ and $p_{sym}(x)$ are known, the fraction $\pi_{sym}$ can be obtained by first solving (8) in order to obtain $\pi$, and then evaluating (10).
Notice that, in addition to $\pi$ and $\pi_{sym}$, formulas (7) and (9) can be used to compute $\pi_s(x)$ and $\pi^{sym}_s(x)$, which may give information on which susceptibles are more vulnerable.
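The two-step procedure of this subsection (first the scalar equation for the final infection size, then the symptomatic size) can be sketched for a finite number of susceptible classes, where integrals over the structuring variable become sums. The function name, the discretization into classes and the numerical values below are assumptions made for illustration; every class is assumed to have a strictly positive initial fraction.

```python
import math


def heterogeneous_susceptible_sizes(s0, r0, p_sym, tol=1e-12, max_iter=100_000):
    """Discrete-class sketch of the heterogeneous-susceptible case.

    s0[j]    : initial fraction of susceptibles in class j (positive, sums to 1)
    r0[j]    : secondary infections one infected individual causes in class j
    p_sym[j] : probability that an infected individual of class j shows symptoms
    Returns (pi, pi_s, pi_sym): total final size, per-class final sizes and
    total final symptomatic size.
    """
    def infected_fraction(j, pi):
        return s0[j] * (1.0 - math.exp(-r0[j] * pi / s0[j]))

    pi = 1.0  # iterate the scalar final size equation from above
    for _ in range(max_iter):
        new_pi = sum(infected_fraction(j, pi) for j in range(len(s0)))
        converged = abs(new_pi - pi) < tol
        pi = new_pi
        if converged:
            break

    pi_s = [infected_fraction(j, pi) for j in range(len(s0))]
    pi_sym = sum(p * f for p, f in zip(p_sym, pi_s))
    return pi, pi_s, pi_sym


# Hypothetical example: two equally sized classes with the same exposure
# but different probabilities of developing symptoms.
pi, pi_s, pi_sym = heterogeneous_susceptible_sizes(
    s0=[0.5, 0.5], r0=[1.0, 1.0], p_sym=[0.2, 0.6]
)
```

In this particular example the per-class reproduction numbers are proportional to the class sizes, so the total final size coincides with the homogeneous one for $R_0 = 2$, while the symptomatic size reflects the class-dependent probabilities.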

Heterogeneous Susceptible and Infected Population
Let us now consider heterogeneity in both populations, susceptible and infected. As before, let $s_0(x)$ be the density of susceptibles within class $x$ at the beginning of the epidemic (i.e. $s_0(x) N$ is the number of susceptibles of type $x$ at that moment). Let us define $R_0(x, y) > 0$ as the number of secondary infections a primary infected individual of type $y$ causes to susceptibles of type $x$. Notice that $R_0(x, y)$ depends on $s_0(x)$. Indeed, if there are no susceptibles of class $x$ (i.e. if $s_0(x) = 0$) then necessarily $R_0(x, y) = 0$. Let us assume there is a finite number $n$ of susceptible classes and a finite number $m$ of infected classes. That is, if $x$ and $y$ denote the type of susceptible and infected individuals, respectively, then $x \in \{x_1, x_2, \dots, x_n\}$ and $y \in \{y_1, y_2, \dots, y_m\}$. The class a susceptible belongs to is fixed, in the sense that its class remains the same until it becomes infected. The same is assumed for infected individuals: the class an infected individual belongs to is always the same until it recovers. In analogy to $K(N)$ in the previous subsections, let $K(N, y_i)$ be the accumulated number of infected individuals of type $y_i$ when the epidemic ends, and let us assume that

$$\frac{K(N, y_i)}{N} \longrightarrow \pi(y_i) \quad \text{in probability as } N \to \infty, \quad i \in \{1, \dots, m\}. \tag{11}$$

Here, $\pi(y_i)$ can be interpreted as the proportion of accumulated infected individuals of type $y_i$ at the end of the epidemic when $N$ is large enough.
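In this case, the probability that a test individual of type $x$ becomes infected is (using property (4) once more)

$$P(u \text{ infected} \mid u \text{ is of type } x) = 1 - \exp\left(-\frac{1}{s_0(x)} \sum_{i=1}^{m} R_0(x, y_i)\, \pi(y_i)\right).$$

In particular, if whenever a susceptible individual of type $x_j$ is infected, it becomes an infected individual of type $y_j$ (that is, if there is a one-to-one correspondence between classes of susceptible and infected, which implies $m = n$), one then has

$$\pi(y_j) = \pi_s(x_j) = s_0(x_j) \left(1 - \exp\left(-\frac{1}{s_0(x_j)} \sum_{i=1}^{n} R_0(x_j, y_i)\, \pi(y_i)\right)\right), \tag{12}$$

where $\pi_s(x_j)$ represents the fraction of susceptible individuals of class $x_j$ infected during the epidemic. More generally, if whenever a susceptible individual of type $x_j$ is infected, it becomes an infected individual of type $y_k$ with probability $p_{x_j \to y_k}$, we have, for all $k \in \{1, \dots, m\}$,

$$\pi(y_k) = \sum_{j=1}^{n} p_{x_j \to y_k}\, s_0(x_j) \left(1 - \exp\left(-\frac{1}{s_0(x_j)} \sum_{i=1}^{m} R_0(x_j, y_i)\, \pi(y_i)\right)\right), \tag{13}$$

and in this case $\pi_s(x_j)$, for $j \in \{1, \dots, n\}$, can be expressed in terms of $\pi(y_k)$ as

$$\pi_s(x_j) = s_0(x_j) \left(1 - \exp\left(-\frac{1}{s_0(x_j)} \sum_{i=1}^{m} R_0(x_j, y_i)\, \pi(y_i)\right)\right).$$

Let us note that, for both the particular case (12) and the general case (13), in order to find the vector $\boldsymbol{\pi} := (\pi(y_1), \dots, \pi(y_m))$ we have to solve an equation of the form $\boldsymbol{\pi} = F(\boldsymbol{\pi})$ with $F : \mathbb{R}^m \to \mathbb{R}^m$, where the image of $F$ corresponds to the right-hand side of equations (13) for the different values of $k$, i.e. for $w \in \mathbb{R}^m$ and $k \in \{1, \dots, m\}$,

$$F_k(w) := \sum_{j=1}^{n} p_{x_j \to y_k}\, s_0(x_j) \left(1 - \exp\left(-\frac{1}{s_0(x_j)} \sum_{i=1}^{m} R_0(x_j, y_i)\, w_i\right)\right).$$

If $F$ has a unique fixed point $\boldsymbol{\pi}$ with nonnegative entries and the sequence of iterates $F^l(\boldsymbol{\pi}_0) \to \boldsymbol{\pi}$ when $l \to \infty$ for all $\boldsymbol{\pi}_0$ with positive entries, then (13), interpreted as a fixed point equation, gives a method to obtain the final infection size vector. The following result describes when this is the case.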
Theorem 1 Let $R_0$ be the reproduction number associated with the model considered. Then,
• if $R_0 < 1$, then $0$ is the only fixed point of $F$ in the positive cone and $F^l(w) \to 0$ when $l \to \infty$ for all $w$ in the positive cone,
• if $R_0 > 1$, then there exists a fixed point $\boldsymbol{\pi} \neq 0$ of $F$ in the positive cone such that $F^l(w) \to \boldsymbol{\pi}$ when $l \to \infty$ for all $w$ satisfying […]. If $\boldsymbol{\pi}$ is the only nonzero fixed point of $F$ in the positive cone, then $F^l(w) \to \boldsymbol{\pi}$ when $l \to \infty$ for all $w \neq 0$ of the positive cone.
Proof See Appendix.
Generically, when R 0 > 1, F has only one nonzero fixed point in the positive cone. The degenerate cases with multiple nonzero fixed points correspond to scenarios in which the infection may not be able to spread to all infected classes. This occurs, for instance, if the secondary cases caused by an infected individual are always of the same class as the primary infection. In this case, there are multiple final infection sizes depending on the initial distribution of infected individuals. From now on, we assume that F has only one nonzero fixed point.
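The fixed point iteration discussed above can be sketched as follows for finitely many susceptible and infected classes. The helper names `make_F` and `iterate_to_fixed_point`, the array layout and the numerical values are hypothetical; the map implemented is the one described in the text, namely the per-class infection probability of a test individual, weighted by the initial class sizes and redistributed over the infected classes.

```python
import math


def make_F(s0, r0, p):
    """Build the fixed point map for n susceptible and m infected classes.

    s0[j]    : initial fraction of susceptibles of type x_j (positive)
    r0[j][i] : secondary infections one infected of type y_i causes among
               susceptibles of type x_j
    p[j][k]  : probability that an infected susceptible of type x_j becomes
               an infected individual of type y_k (each row sums to 1)
    """
    n, m = len(s0), len(p[0])

    def F(w):
        # Fraction of each susceptible class that ends up infected, given w.
        pi_s = [
            s0[j] * (1.0 - math.exp(-sum(r0[j][i] * w[i] for i in range(m)) / s0[j]))
            for j in range(n)
        ]
        # Redistribute the infected susceptibles over the infected classes.
        return [sum(p[j][k] * pi_s[j] for j in range(n)) for k in range(m)]

    return F


def iterate_to_fixed_point(F, w0, tol=1e-12, max_iter=100_000):
    """Iterate w -> F(w) until successive iterates differ by less than tol."""
    w = list(w0)
    for _ in range(max_iter):
        new_w = F(w)
        if max(abs(a - b) for a, b in zip(new_w, w)) < tol:
            return new_w
        w = new_w
    return w


# Hypothetical supercritical example with a one-to-one class correspondence.
F_super = make_F([0.6, 0.4], [[1.2, 0.6], [0.4, 0.8]], [[1.0, 0.0], [0.0, 1.0]])
pi_super = iterate_to_fixed_point(F_super, [1.0, 1.0])

# Same structure with halved transmission (subcritical): iterates go to zero.
F_sub = make_F([0.6, 0.4], [[0.6, 0.3], [0.2, 0.4]], [[1.0, 0.0], [0.0, 1.0]])
pi_sub = iterate_to_fixed_point(F_sub, [1.0, 1.0])
```

In the first example the matrix of secondary infections has spectral radius larger than one and the iterates settle on a nonzero vector, while in the second the spectral radius is below one and the iterates shrink to zero, which is the dichotomy stated in Theorem 1.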
The vector $\boldsymbol{\pi} = (\pi(y_1), \dots, \pi(y_m))$ then gives the fraction of infected individuals of different types at the end of the epidemic and allows us to compute the proportion of infected individuals in the population at the end of the epidemic, since

$$\pi = \sum_{i=1}^{m} \pi(y_i). \tag{15}$$

Let us note that

$$\sum_{i=1}^{m} \pi(y_i) = \sum_{j=1}^{n} \pi_s(x_j),$$

as expected, since the sum of infected individuals must coincide with the sum of all susceptible individuals that became infected.

The previous arguments cannot be applied when the set of infected classes is not finite. If the possible infected classes form an open set of a Euclidean space, denoted by $\mathcal{Y}$, then $K(N, y)$ could be 0 almost surely for all $y \in \mathcal{Y}$. To address this problem, instead of $K(N, y)$ one must consider $K(N, \omega)$, defined as the accumulated number of infected individuals of types $y \in \omega \subset \mathcal{Y}$ at the end of the epidemic (where $\omega$ is a Lebesgue-measurable set), and assumption (11) should be replaced by

$$\frac{K(N, \omega)}{N} \longrightarrow \pi(\omega) \quad \text{as } N \to \infty$$

in probability for all Lebesgue-measurable $\omega \subset \mathcal{Y}$. In this case, $\pi(\omega)$ corresponds to the proportion of accumulated infected individuals with types $y \in \omega$ at the end of the epidemic when $N$ is large enough. From now on, let us restrict ourselves to models in which the function $\pi$ can be written in terms of an integrable function $\hat{\pi} : \mathcal{Y} \to \mathbb{R}$ as

$$\pi(\omega) = \int_{\omega} \hat{\pi}(y)\, dy.$$

Notice that if $\mathcal{Y} = [0, 1]$, such a function $\hat{\pi}$ exists provided the mapping $y \mapsto \pi([0, y])$ (which is increasing) is absolutely continuous. The fraction of infected population at the end of the epidemic when $N$ tends to infinity (i.e. the final infection size) is

$$\pi = \pi(\mathcal{Y}) = \int_{\mathcal{Y}} \hat{\pi}(y)\, dy.$$

Consider now a random variable $Y$ with density

$$\frac{\hat{\pi}(y)}{\pi}, \quad y \in \mathcal{Y}. \tag{16}$$

Then, since the accumulated number of infected individuals when the epidemic ends is $K(N, \mathcal{Y})$,

$$P(u \text{ infected} \mid u \text{ is of type } x) = 1 - E\left[\prod_{i=1}^{K(N, \mathcal{Y})} \left(1 - \frac{R_0(x, Y_i)}{s_0(x) N}\right)\right] \tag{17}$$

for all $x \in X$, where $X$ is the set of all susceptible classes, $s_0(x)$ is the initial distribution of susceptibles and $Y_i$ are independent random variables with density given by (16).
Notice that, when x takes values in a continuum, then both s 0 (x)N and R 0 (x, y) could be densities of individuals with respect to the susceptible structuring variable x (and not individuals as in the case of a finite number of susceptible classes).
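In order to rewrite the right-hand side of (17) in terms of $\hat{\pi}$, notice that

$$E\left[R_0(x, Y)\right] = \frac{1}{\pi} \int_{\mathcal{Y}} R_0(x, y)\, \hat{\pi}(y)\, dy.$$

In particular, if the moments of $R_0(x, Y)$ are finite (for all $x \in X$), we have

$$\lim_{N \to \infty} P(u \text{ infected} \mid u \text{ is of type } x) = 1 - \exp\left(-\frac{1}{s_0(x)} \int_{\mathcal{Y}} R_0(x, y)\, \hat{\pi}(y)\, dy\right).$$

Therefore, with an argument analogous to the one used in the finite case, if a susceptible individual of type $x$, once infected, becomes an infected individual of type $y$ with probability 1, we have (interpreting $P(u \text{ is of type } x)$ as the probability density of $u$ being of type $x$, and identifying each infected type with the susceptible type it originates from)

$$\hat{\pi}(x) = \pi_s(x) = s_0(x) \left(1 - \exp\left(-\frac{1}{s_0(x)} \int_{\mathcal{Y}} R_0(x, y)\, \hat{\pi}(y)\, dy\right)\right).$$

In general, if $p_i(x, y)$ denotes the probability density that a susceptible individual of type $x$, once infected, becomes an infected individual of type $y$, then

$$\hat{\pi}(y) = \int_X p_i(x, y)\, \pi_s(x)\, dx, \qquad \pi_s(x) = s_0(x) \left(1 - \exp\left(-\frac{1}{s_0(x)} \int_{\mathcal{Y}} R_0(x, z)\, \hat{\pi}(z)\, dz\right)\right). \tag{19}$$

From this integral equation, one could, at least numerically, find $\hat{\pi}(y)$ (and in particular the fraction $\pi = \int_{\mathcal{Y}} \hat{\pi}(y)\, dy$ of infected individuals during the epidemic). In order to be able to compute the fraction of symptomatic cases, one more ingredient should be introduced in the problem, namely the probability that an infected individual of type $y$ that was formerly a susceptible of type $x$ develops symptoms. Let $p_{sym}(x, y)$ denote this probability. Then, $\pi_s(x)\, p_i(x, y)\, p_{sym}(x, y)$ will be the distribution of symptomatic individuals with respect to the infected and susceptible types, and the total fraction of symptomatic individuals during the epidemic will be

$$\pi_{sym} = \int_X \int_{\mathcal{Y}} \pi_s(x)\, p_i(x, y)\, p_{sym}(x, y)\, dy\, dx. \tag{20}$$

In the particular case that the probability of having symptoms only depends on the class to which the infected individual belongs, i.e., if $p_{sym}(x, y) = p_{sym}(y)$, then, using (19), the following identity holds for $\pi_{sym}$:

$$\pi_{sym} = \int_{\mathcal{Y}} \hat{\pi}(y)\, p_{sym}(y)\, dy.$$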
In the case of finite classes of both susceptible and infected individuals, formula (20) reduces to

$$\pi_{sym} = \sum_{i=1}^{n} \sum_{j=1}^{m} \pi_s(x_i)\, p_{x_i \to y_j}\, p_{sym}(x_i, y_j).$$

In fact, one could define

$$\pi^{sym}_s(x_i) := \pi_s(x_i) \sum_{j=1}^{m} p_{x_i \to y_j}\, p_{sym}(x_i, y_j), \qquad \pi^{sym}(y_j) := \sum_{i=1}^{n} \pi_s(x_i)\, p_{x_i \to y_j}\, p_{sym}(x_i, y_j),$$

in such a way that $\pi^{sym}_s(x_i)$ is the fraction (with respect to the total population $N$) of susceptibles in class $x_i$ that will end up developing symptoms and $\pi^{sym}(y_j)$ is the fraction (also with respect to the total population $N$) of individuals that will end up showing symptoms and that during the asymptomatic phase were infected individuals of type $y_j$.
When the classes of susceptible and infected individuals are equivalent (that is, if $n = m$ and $p_{x_i \to y_j} = 1$ if $i = j$ and 0 otherwise, implying $\pi_s(x_i) = \pi(y_i)$), one has $\pi^{sym}_s(x_i) = \pi^{sym}(y_i)$.
Moreover, in this case the probability of an infected individual developing symptoms does not depend on its type as susceptible (since this is already determined). Thus,

$$\pi_{sym} = \sum_{i=1}^{n} \pi^{sym}(y_i) = \sum_{i=1}^{n} \pi(y_i)\, p_{sym}(y_i). \tag{22}$$

Age of Infection Model
Let us consider the following age of infection structured model considering both asymptomatic and symptomatic individuals:

$$\begin{aligned} S'(t) &= -\frac{S(t)}{N} \left(\int_0^T \beta_1(\tau)\, i(t, \tau)\, d\tau + \beta_2 J(t)\right), \\ \partial_t i(t, \tau) + \partial_\tau i(t, \tau) &= -\gamma_1(\tau)\, i(t, \tau), \\ i(t, 0) &= \frac{S(t)}{N} \left(\int_0^T \beta_1(\tau)\, i(t, \tau)\, d\tau + \beta_2 J(t)\right), \\ J'(t) &= p\, i(t, T) - \gamma_2 J(t), \end{aligned} \tag{23}$$

where the state variables $S$, $i$ and $J$ represent the density of susceptibles, infected asymptomatic and infected symptomatic individuals, respectively. The total population is denoted by $N$ and is constant in time. The infected asymptomatic individuals are structured by the age of infection, denoted by $\tau$, i.e. the time that has passed since the individual became infected. The parameters $\beta_1$ and $\beta_2$ are the transmission rates of asymptomatic and symptomatic individuals, respectively, and $\gamma_1$ and $\gamma_2$ are the recovery rates of asymptomatic and symptomatic individuals, respectively. Infected individuals that reach age of infection $T$ develop symptoms with probability $p$. Therefore, an infected individual can recover without presenting symptoms if the recovery happens before the age of infection attains $T$ or if at age of infection $T$ it recovers with probability $1 - p$. For more details, see Barril et al. (2021b). Although in this system the infected individuals are structured by the age of infection, the system can be analysed as if there were no differences between the infected individuals. This is so because all infected individuals have the same properties. Therefore, following the homogeneous population formalism described in the previous section, in order to determine the final infection size and the final symptomatic size, denoted by $\pi$ and $\pi_{sym}$, respectively, we must compute the reproduction number associated with the model ($R_0$) and the probability that an infected individual presents symptoms at some point ($p_{sym}$).
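The probability $p_{sym}$ can be computed by noticing that for an infected individual to become symptomatic, it should not recover before age of infection $T$, and then, it should present symptoms, which occurs with probability $p$. Since $\gamma_1(\tau)$ is the recovery rate of asymptomatic individuals with age of infection $\tau$, the probability that an asymptomatic individual does not recover before age of infection $T$ is:

$$e^{-\int_0^T \gamma_1(\tau)\, d\tau}, \quad \text{so that} \quad p_{sym} = p\, e^{-\int_0^T \gamma_1(\tau)\, d\tau}.$$

Now we compute $R_0$, understood as the expected number of secondary asymptomatic cases produced by an asymptomatic primary case. The expression for $R_0$ was given but not derived in Barril et al. (2021b). In order to compute it, we follow the formalism developed in Diekmann et al. (1990) (see also Barril et al. 2018, 2021a), where $R_0$ is obtained as the spectral radius of the so-called next-generation operator. Specifically, we linearize system (23) around the disease-free steady state $(N, 0, 0)$ to obtain the following equations for the dynamics of the infected population:

$$\begin{aligned} \partial_t i(t, \tau) + \partial_\tau i(t, \tau) &= -\gamma_1(\tau)\, i(t, \tau), \\ i(t, 0) &= \int_0^T \beta_1(\tau)\, i(t, \tau)\, d\tau + \beta_2 J(t), \\ J'(t) &= p\, i(t, T) - \gamma_2 J(t). \end{aligned}$$

Then, defining as an infection event the moment in which an individual becomes asymptomatic, we can consider the following birth/infection operator

$$B : L^1(0, T) \times \mathbb{R} \to \langle \delta_0 \rangle \times \mathbb{R}, \qquad B(i, J) = \left(\delta_0\, \hat{B}(i, J),\ 0\right), \qquad \hat{B}(i, J) := \int_0^T \beta_1(\tau)\, i(\tau)\, d\tau + \beta_2 J,$$

and the mortality/transition operator

$$M(i, J) = \left(i' + \gamma_1 i,\ \gamma_2 J - p\, i(T)\right), \qquad D_M = \left\{(i, J) \in W^{1,1}(0, T) \times \mathbb{R} : i(0) = 0\right\}. \tag{24}$$

The space $\langle \delta_0 \rangle$ is the space generated by a Dirac delta centred at 0. Due to the fact that the range of $B$ is not a subspace of its domain, i.e. $\langle \delta_0 \rangle \times \mathbb{R} \not\subset L^1(0, T) \times \mathbb{R}$, $R_0$ cannot be defined as the spectral radius of the next-generation operator defined by $B M^{-1}$. However, as shown in Barril et al. (2021a), in the present case $R_0$ can be computed as:

$$R_0 = \hat{B}\, G_M(\cdot, 0),$$

where $G_M(\cdot, 0)$ is the Green function of the operator $M$ associated to the impulse $(\delta_0, 0)$. In particular, since $\hat{B}$ is continuous, $R_0$ can be expressed as

$$R_0 = \lim_{k \to \infty} \hat{B}\, M^{-1} \varphi_k,$$

where $(\varphi_k)_k \subset L^1(0, T) \times \mathbb{R}$ is a sequence approximating the impulse $(\delta_0, 0)$. Therefore, to compute $G_M(\cdot, 0)$, first we determine the preimages of $\varphi_k = (\hat{i}_k, \hat{J}_k) \in L^1(0, T) \times \mathbb{R}$ under the linear operator $M$ with domain given in (24).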
To do that, let us consider

$$(i_k, J_k) := M^{-1}(\hat{i}_k, \hat{J}_k),$$

which implies, by applying $M$ to both sides and using that $i_k(0) = 0$ (because $(i_k, J_k) \in D_M$),

$$i_k(\tau) = \int_0^{\tau} e^{-\int_s^{\tau} \gamma_1(\sigma)\, d\sigma}\, \hat{i}_k(s)\, ds, \qquad J_k = \frac{\hat{J}_k + p\, i_k(T)}{\gamma_2}.$$

With these expressions we finally have, defining $\Gamma_1(\tau) := e^{\int_0^{\tau} \gamma_1(s)\, ds}$,

$$G_M(\tau, 0) = \left(\frac{1}{\Gamma_1(\tau)},\ \frac{p}{\gamma_2\, \Gamma_1(T)}\right).$$

The reproduction number is then:

$$R_0 = \int_0^T \frac{\beta_1(\tau)}{\Gamma_1(\tau)}\, d\tau + \frac{p}{\Gamma_1(T)}\, \frac{\beta_2}{\gamma_2},$$

which is meaningful from the biological point of view since the right-hand side of the expression above can be interpreted as the expected number of secondary infections produced by an infected individual during the asymptomatic phase plus the probability that an infected individual presents symptoms, that is $p_{sym} = p / \Gamma_1(T)$, multiplied by the expected number of secondary infections produced by a symptomatic individual, that is $\beta_2 / \gamma_2$. Once the expressions for $R_0$ and $p_{sym}$ in terms of the parameters of the model are determined, equations (5) and (6) can be used to compute $\pi$ and $\pi_{sym}$, which are given by

$$\pi = 1 - e^{-R_0 \pi}, \qquad \pi_{sym} = \frac{p}{\Gamma_1(T)}\, \pi.$$

Remark Notice that $\pi_{sym}$ does not satisfy $\pi_{sym} = 1 - e^{-\tilde{R}_0 \pi_{sym}}$, where $\tilde{R}_0$ denotes the number of secondary symptomatic cases produced by a primary symptomatic individual in a fully susceptible population. This quantity is computed (for the model above) in Barril et al. (2021b). This means that formula (5) is not valid for arbitrary definitions of what an "infection event" is, for instance, when considering that the "infection event" occurs when an individual starts presenting symptoms. The reason why (5) fails in this case is that the test individual may become immune without ever entering the symptomatic compartment. Specifically, the first symptomatic individual may cause the test individual to become immune without becoming symptomatic, and in this case the probability that the second symptomatic individual (and the third, the fourth, etc.) causes the test individual to present symptoms is zero (even though the test individual has never presented symptoms!). That is, the probability that a random symptomatic individual causes the symptoms of the test individual depends on the presence of other symptomatic individuals: it is $\tilde{R}_0 / N$ if there is only one symptomatic individual throughout the epidemic, but it is smaller than $\tilde{R}_0 / N$ if the number of accumulated symptomatic individuals is bigger than one. In fact, such a probability decreases with the accumulated number of symptomatic individuals at the end of the epidemic, denoted by $\tilde{K}(N)$, and this prevents us from following the reasoning of Sect. 2.1, since then the probability that the test individual presents symptoms can no longer be written as $1 - P\left(\mathrm{Bin}(\tilde{K}(N), \tilde{R}_0 / N) = 0\right)$.
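For the age of infection model of this section, the two ingredients of the homogeneous formalism can be evaluated numerically. The sketch below assumes that the asymptomatic contribution to the reproduction number is the integral over $[0, T]$ of the transmission rate weighted by the probability of not having recovered yet, to which the symptomatic contribution $p_{sym}\, \beta_2 / \gamma_2$ is added, as in the biological interpretation given above. The function names, the trapezoidal quadrature and the constant parameter values are assumptions made for illustration.

```python
import math


def age_of_infection_indicators(beta1, gamma1, beta2, gamma2, p, T, steps=20_000):
    """Return (r0, p_sym) for age-dependent asymptomatic rates beta1(tau), gamma1(tau).

    The integrals are approximated with the trapezoidal rule on a uniform grid.
    """
    h = T / steps
    cum_hazard = 0.0      # integral of gamma1 from 0 to the current tau
    asym_part = 0.0       # integral of beta1(tau) * exp(-cum_hazard(tau))
    prev = beta1(0.0)     # integrand at tau = 0, where the survival probability is 1
    for k in range(1, steps + 1):
        tau = k * h
        cum_hazard += 0.5 * h * (gamma1(tau - h) + gamma1(tau))
        cur = beta1(tau) * math.exp(-cum_hazard)
        asym_part += 0.5 * h * (prev + cur)
        prev = cur
    p_sym = p * math.exp(-cum_hazard)           # survive until age T, then show symptoms
    r0 = asym_part + p_sym * beta2 / gamma2     # asymptomatic plus symptomatic contribution
    return r0, p_sym


def final_size(r0, iterations=5_000):
    """Positive root of pi = 1 - exp(-r0 * pi) by fixed-point iteration (0 if r0 <= 1)."""
    if r0 <= 1.0:
        return 0.0
    pi = 1.0
    for _ in range(iterations):
        pi = 1.0 - math.exp(-r0 * pi)
    return pi


# Hypothetical constant rates, for illustration only.
r0, p_sym = age_of_infection_indicators(
    beta1=lambda tau: 1.5, gamma1=lambda tau: 0.5,
    beta2=1.0, gamma2=1.0, p=0.6, T=2.0,
)
pi = final_size(r0)
pi_sym = p_sym * pi
```

With constant rates the integrals have closed forms, which gives a convenient check of the quadrature; for age-dependent rates only the two callables need to change.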

Model with Individual Heterogeneity (Finite Number of Classes)
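Let us consider an epidemic model with individual heterogeneity where the susceptible and infected populations are structured by the discrete variables $x \in \{x_1, \dots, x_n\}$ and $y \in \{y_1, \dots, y_n\}$, respectively, and which is described by the dynamical equations

$$\begin{aligned} S_i'(t) &= -\frac{S_i(t)}{N} \sum_{j=1}^{n} \left(\beta_{ij} I_j(t) + \beta^{sym}_{ij} J_j(t)\right), \\ I_i'(t) &= \frac{S_i(t)}{N} \sum_{j=1}^{n} \left(\beta_{ij} I_j(t) + \beta^{sym}_{ij} J_j(t)\right) - (\gamma_i + p_i)\, I_i(t), \\ J_i'(t) &= p_i I_i(t) - \gamma^{sym}_i J_i(t), \end{aligned} \tag{25}$$

for $i \in \{1, \dots, n\}$.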
Here $S_i(t) = S(t, x_i)$ denotes the susceptible population with state $x_i$ at time $t$, whereas $I_i(t) = I(t, y_i)$ and $J_i(t) = J(t, y_i)$ denote, respectively, the asymptomatic and symptomatic populations with state $y_i$ at time $t$. The parameter $p_i = p(y_i)$ denotes the probability per unit of time that an infected individual with state $y_i$ develops symptoms (it corresponds to the rate at which an infected individual presents symptoms), and $\gamma_i = \gamma(y_i)$ and $\gamma^{sym}_i = \gamma^{sym}(y_i)$ the recovery rates of asymptomatic and symptomatic individuals with state $y_i$, respectively. The parameter $\beta_{ij} = \beta(x_i, y_j)$ is the transmission rate between asymptomatic individuals with state $y_j$ and susceptible individuals with state $x_i$, and $\beta^{sym}_{ij} = \beta^{sym}(x_i, y_j)$ stands for the transmission rate between symptomatic individuals with state $y_j$ and susceptible individuals with state $x_i$. Notice that the previous equations imply that after infection, a susceptible of type $x_i$ becomes an infected individual of type $y_i$. Notice also that we are implicitly assuming that infected individuals start by being asymptomatic and only after some time may become symptomatic.
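The dynamics of the epidemic is supposed to be sufficiently fast, and therefore, the demographic processes are not considered. Let $N$ be the total population at the beginning of the epidemic. There is a continuum of disease-free steady states, which are

$$\left(N s_0(x_1), \dots, N s_0(x_n),\ 0, \dots, 0,\ 0, \dots, 0\right) \quad \text{with } s_0(x_i) \geq 0 \text{ and } \sum_{i=1}^{n} s_0(x_i) = 1.$$

Linearizing around one of these disease-free steady states, we obtain the following equations for the dynamics of the infected population:

$$\begin{aligned} I_i'(t) &= s_0(x_i) \sum_{j=1}^{n} \left(\beta_{ij} I_j(t) + \beta^{sym}_{ij} J_j(t)\right) - (\gamma_i + p_i)\, I_i(t), \\ J_i'(t) &= p_i I_i(t) - \gamma^{sym}_i J_i(t). \end{aligned}$$

Clearly, the state space of the linearized system is $\mathbb{R}^n \times \mathbb{R}^n$. Let us use a basis of this space so that the states are written as pairs of $n$-tuples: $((I_1, \dots, I_n), (J_1, \dots, J_n))$.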
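Defining as an infection event the moment in which an individual becomes asymptomatic, we can consider the following birth/infection operator $B$ and mortality/transition operator $M$:

$$B = \begin{pmatrix} D(s_0)\, \beta & D(s_0)\, \beta^{sym} \\ 0 & 0 \end{pmatrix}, \qquad M = \begin{pmatrix} D(\gamma + p) & 0 \\ -D(p) & D(\gamma^{sym}) \end{pmatrix},$$

where $D(v)$ denotes a diagonal matrix whose diagonal entries are given by the vector $v$, where $s_0$, $\gamma$, $\gamma^{sym}$ and $p$ denote vectors whose $i$th entries are, respectively, $s_0(x_i)$, $\gamma_i$, $\gamma^{sym}_i$ and $p_i$, and where $\beta$ and $\beta^{sym}$ are matrices whose entries (row $i$, column $j$) are $\beta_{ij}$ and $\beta^{sym}_{ij}$, respectively.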
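With these two operators, we obtain the so-called next-generation operator (Diekmann et al. 1990; Inaba 2017; Barril et al. 2021b)

$$B M^{-1} = \begin{pmatrix} D(s_0) \left(\beta\, D\!\left(\frac{1}{\gamma + p}\right) + \beta^{sym}\, D\!\left(\frac{p}{(\gamma + p)\, \gamma^{sym}}\right)\right) & D(s_0)\, \beta^{sym}\, D\!\left(\frac{1}{\gamma^{sym}}\right) \\ 0 & 0 \end{pmatrix},$$

where, with slight abuse of notation, for any vector $v$ with positive components, $\frac{1}{v}$ denotes the vector whose components are the inverses of the components of $v$ (products and quotients of vectors being likewise understood componentwise).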
With the next-generation operator, we can compute the number of secondary infections that an infected individual of type $y_j$ causes among susceptibles of type $x_i$. It is enough to apply this operator to the $j$th basis vector of $\mathbb{R}^n \times \mathbb{R}^n$, namely $(e_j, 0)$ (this represents the situation in which there is only one infected asymptomatic individual of type $y_j$, i.e. $I_j = 1$, $I_i = 0$ for all $i \neq j$ and $J_i = 0$ for all $i$). The $i$th component of the first half of the resulting vector $BM^{-1}(e_j, 0)$ gives the number of secondary cases of type $y_i$ produced by the primary infected individual of type $y_j$ (notice that $BM^{-1}(e_j, 0)$ has $2n$ components, but the second half of the vector is zero because new infected individuals are always asymptomatic). Since all infected individuals of type $y_i$ were susceptible of type $x_i$, we conclude that the number of secondary infections an infected individual of type $y_j$ causes to susceptibles of type $x_i$ is the scalar product of $(e_i, 0)$ and $BM^{-1}(e_j, 0)$, i.e.,

$$R_0(x_i, y_j) = (e_i, 0) \cdot BM^{-1}(e_j, 0).$$

The final infection size of the different classes, i.e. $\boldsymbol{\pi} = (\pi(y_1), \dots, \pi(y_n))$, is a fixed point of equation (12), that is

$$\boldsymbol{\pi} = F(\boldsymbol{\pi}). \tag{26}$$

Once we know the final infection size of each class, we can compute the final symptomatic size. To do this, we have to determine the probability that an infected individual of type $y_i$ presents symptoms. Since the recovery rate of this individual is $\gamma_i$ and the rate of presenting symptoms is $p_i$, the probability that this individual presents symptoms is

$$p^{\mathrm{sym}}(y_i) = \frac{p_i}{p_i + \gamma_i}.$$

Consequently, the final symptomatic size of infected individuals of type $y_i$ (which coincides with the final symptomatic size of susceptible individuals of type $x_i$ because the classes of susceptible and infected individuals coincide, i.e., $\pi^{\mathrm{sym}}(y_i) = \pi^{\mathrm{sym}}_s(x_i)$) is

$$\pi^{\mathrm{sym}}(y_i) = \pi(y_i)\, p^{\mathrm{sym}}(y_i) = \pi(y_i)\, \frac{p_i}{p_i + \gamma_i}. \tag{27}$$

Once $\boldsymbol{\pi}$ and $\boldsymbol{\pi}^{\mathrm{sym}} = (\pi^{\mathrm{sym}}(y_1), \dots, \pi^{\mathrm{sym}}(y_n))$ are determined, the final infected and symptomatic sizes can be computed using (15) and (22). In Fig. 1, we compare the theoretical and numerical results of a specific example of system (25).

Fig. 1 Comparison between the theoretical result (dashed lines) and the numerical result obtained by integrating system (25) of Sect. 4 (continuous lines). The left plot shows the infected size of a system with 5 different classes with respect to time (the dashed lines represent $\boldsymbol{\pi}$), and the right plot shows the symptomatic size of the same system and the same classes (the dashed lines represent $\boldsymbol{\pi}^{\mathrm{sym}}$). Notice that for this particular example the class that has the largest final infected size (the class represented in purple) is not the class that has the largest final symptomatic size (the class represented in red). Adding the different dashed lines we would obtain the total final infected size $\pi$ (on the left) and the total final symptomatic size $\pi^{\mathrm{sym}}$ (on the right). The parameters of the simulation are (following the notation in the main text): $n = 5$, $s_0($…
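The computation described above can be sketched numerically. The following is a minimal illustration, not the authors' code: it assumes that the fixed-point map has the form $F_i(\boldsymbol{\pi}) = s_{0,i}\,(1 - \exp(-\sum_j R_0(x_i, y_j)\,\pi(y_j)/s_{0,i}))$, where $s_{0,i}$ is the initial fraction of susceptibles of type $x_i$, and all parameter values are hypothetical.

```python
import numpy as np

def final_sizes(R0, s0, p, gamma, tol=1e-12, max_iter=10_000):
    """Iterate pi <- F(pi) and return (pi, pi_sym).

    Assumed form of F (an assumption of this sketch):
        F_i(pi) = s0_i * (1 - exp(-sum_j R0[i, j] * pi_j / s0_i)),
    with R0[i, j] the number of secondary infections that one infected
    individual of type y_j causes among susceptibles of type x_i.
    """
    R0, s0 = np.asarray(R0, float), np.asarray(s0, float)
    p, gamma = np.asarray(p, float), np.asarray(gamma, float)
    pi = s0.copy()                       # start from "everyone infected"
    for _ in range(max_iter):
        new = s0 * (1.0 - np.exp(-(R0 @ pi) / s0))
        converged = np.max(np.abs(new - pi)) < tol
        pi = new
        if converged:
            break
    p_sym = p / (p + gamma)              # probability of presenting symptoms
    return pi, pi * p_sym

# Hypothetical three-class example
s0 = np.array([0.5, 0.3, 0.2])           # initial susceptible fractions
R0 = 2.5 * np.outer(s0, np.ones(3))      # R0[i, j] = 2.5 * s0_i
p = np.array([0.1, 0.5, 0.9])            # rates of presenting symptoms
gamma = np.array([0.5, 0.5, 0.1])        # recovery rates
pi, pi_sym = final_sizes(R0, s0, p, gamma)
print("final infected size per class:   ", pi)
print("final symptomatic size per class:", pi_sym)
```

With these hypothetical values the class with the largest final infected size is not the class with the largest final symptomatic size, which is the same qualitative effect highlighted in Fig. 1.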

Model with Individual Heterogeneity (Continuous Trait)
Let us consider the continuous extension of the model presented in the previous section. Specifically, let us assume that individual heterogeneity is expressed by a continuous variable $x$ taking values in $[0, 1]$, and let us consider the equations where $s(t, x)$ denotes the susceptible, $i(t, x)$ the asymptomatic and $j(t, x)$ the symptomatic population density with state $x$ at time $t$. As a consequence, the phase space of the above system is set to be $L^1(0, 1)^3$. Here $p(x)$ denotes the rate at which asymptomatic individuals develop symptoms, and $\gamma_1(x)$ and $\gamma_2(x)$ the recovery rates of asymptomatic and symptomatic individuals, respectively. $\beta_1(x, y)$ is the transmission rate between asymptomatic individuals with state $y$ and susceptible individuals with state $x$, and $\beta_2(x, y)$ stands for the transmission rate between symptomatic individuals with state $y$ and susceptible individuals with state $x$. As before, the dynamics of the epidemic is assumed to be sufficiently fast, so demographic processes are not considered and $N$ is the total population at the beginning of the epidemic. Linearizing around a disease-free steady state $(N s_0(x), 0, 0)$, with $\int_0^1 s_0(x)\,dx = 1$, we obtain the following equations for the dynamics of the infected population: The birth/infection operator $B$ and mortality/transition operator $M$ (unlike the example of Sect. 3, here both operators are bounded, i.e. they are defined on all of $L^1(0, 1)^2$) are and the next-generation operator is then Here $\hat{x}$ and $\hat{y}$ have been used (instead of $x$ and $y$) in order to avoid a notational collision in formula (29) below.
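With an argument analogous to that of the previous section, we obtain the "continuous" version of equations (26) and (27). Specifically, the final infection density is a solution of the functional equation

$$\pi = F(\pi), \tag{28}$$

with $F : L^1(0, 1) \to L^1(0, 1)$ defined as (recall (18))

$$F(\pi)(x) = \left(1 - \exp\left(-\int_0^1 \frac{\beta_1(x, y)\gamma_2(y) + \beta_2(x, y)p(y)}{(p(y) + \gamma_1(y))\gamma_2(y)}\, \pi(y)\,dy\right)\right) s_0(x),$$

since $R_0(x, y)$ is given by

$$R_0(x, y) = s_0(x)\, \frac{\beta_1(x, y)\gamma_2(y) + \beta_2(x, y)p(y)}{(p(y) + \gamma_1(y))\gamma_2(y)}.$$

Finally, since the probability that an infected individual of class $y$ presents symptoms is

$$p^{\mathrm{sym}}(y) = \frac{p(y)}{p(y) + \gamma_1(y)},$$

the final density of symptomatic cases, structured by the variable $y$, is

$$\pi^{\mathrm{sym}}(y) = \pi(y)\, p^{\mathrm{sym}}(y) = \pi(y)\, \frac{p(y)}{p(y) + \gamma_1(y)},$$

and the total number of symptomatic cases at the end of the epidemic is given (whenever the solution $\pi$ of (28) can be obtained) by

$$\pi^{\mathrm{sym}} = \int_0^1 \pi(y)\, \frac{p(y)}{p(y) + \gamma_1(y)}\,dy.$$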
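As a rough numerical illustration of the continuous-trait formulas, the functional equation can be discretized on a grid of $[0, 1]$ and solved by fixed-point iteration. The sketch below is only one possible approach under stated assumptions: it takes the kernel of $F$ to be $R_0(x, y)/s_0(x) = (\beta_1(x, y)\gamma_2(y) + \beta_2(x, y)p(y)) / ((p(y) + \gamma_1(y))\gamma_2(y))$, uses a midpoint quadrature rule, and all coefficient functions and values (including the helper `const`) are hypothetical.

```python
import numpy as np

def final_density(beta1, beta2, p, gamma1, gamma2, s0, m=200, iters=500):
    """Solve pi = F(pi) on a midpoint grid of [0, 1] by fixed-point iteration.

    Assumed form (an assumption of this sketch):
        F(pi)(x) = s0(x) * (1 - exp(-int_0^1 K(x, y) pi(y) dy)),
        K(x, y)  = (beta1*gamma2 + beta2*p) / ((p + gamma1) * gamma2).
    """
    x = (np.arange(m) + 0.5) / m                  # midpoint nodes
    h = 1.0 / m                                   # quadrature weight
    X, Y = np.meshgrid(x, x, indexing="ij")       # X[i, j] = x_i, Y[i, j] = x_j
    K = (beta1(X, Y) * gamma2(Y) + beta2(X, Y) * p(Y)) / (
        (p(Y) + gamma1(Y)) * gamma2(Y))
    s = s0(x)
    pi = s.copy()                                 # start from the upper bound s0
    for _ in range(iters):
        pi = s * (1.0 - np.exp(-h * (K @ pi)))
    pi_sym = pi * p(x) / (p(x) + gamma1(x))       # symptomatic density
    return x, pi, pi_sym, h * pi.sum(), h * pi_sym.sum()

def const(c):
    """Hypothetical helper: a coefficient that does not depend on the trait."""
    return lambda *args: np.full_like(args[0], c, dtype=float)

# Trait-independent coefficients, for which the result reduces to the
# homogeneous final size equation with R0 = 2
x, pi, pi_sym, total, total_sym = final_density(
    beta1=const(0.5), beta2=const(1.0), p=const(0.3),
    gamma1=const(0.1), gamma2=const(1.0), s0=const(1.0))
print("total final infected size:   ", total)
print("total final symptomatic size:", total_sym)
```

Replacing the constant coefficients by trait-dependent functions of `x` and `y` gives a heterogeneous example; the grid size `m` and the number of iterations would then need to be checked for convergence.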

Conclusion
In this article, we used the probability that a test individual becomes infected and develops symptoms to compute the final number of symptomatic cases that the epidemic will cause. The resulting equations relate the final number of symptomatic cases to the reproduction number of the epidemic and to the probabilities/rates at which infected individuals present symptoms. The equations are, therefore, natural generalizations of the well-known final infection size equations.

It has been discussed elsewhere (see, for instance, Cushing and Diekmann 2016) that basic reproduction numbers depend on how an "infection event" is defined. From a biological point of view, it makes sense to consider that individuals become infected as soon as the infectious agents start to proliferate within them (a biological infection event). However, in practical situations most infected individuals are not reported in the course of an epidemic unless they present symptoms at some time, so that in some sense the onset of symptoms is what defines the "infection event" in epidemiological data (an epidemiological infection event). In this article, $R_0$ is always associated with the biological infection event definition, which explains why the probability of presenting symptoms appears as an independent element in the final symptomatic cases equation. A natural question (and what motivated part of this work) is what happens when adopting the epidemiological infection event definition. More precisely, taking into account that $\pi = 1 - e^{-R_0 \pi}$ is an equation for the final infection size in the homogeneous case (Sect. 2.1), does an analogous formula for the final symptomatic cases hold when the reproduction number is computed according to the epidemiological infection event definition? That is, denoting by $\tilde{R}_0$ such an alternative reproduction number (which gives the secondary symptomatic cases produced by a symptomatic individual), does $\pi^{\mathrm{sym}} = 1 - e^{-\tilde{R}_0 \pi^{\mathrm{sym}}}$ hold?
The answer is no (as shown in the remark at the end of Sect. 3). The reason is that the test individual argument used to deduce (5) cannot be applied in the same way considering only the symptomatic population and $\tilde{R}_0$. The difference in this setting is that the test individual may gain immunity (i.e. recover) at some point without having been part of the symptomatic population (while in the reasoning leading to (5) the test individual stays susceptible until it, eventually, becomes part of the infected population).
This observation has important implications for how the final infection size is computed from the available information. Indeed, if Eq. (5) is used with the reproduction number associated with symptomatic individuals plugged in, the result is neither the final size of symptomatic cases nor the final infection size.
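A small numerical check makes this point concrete in the homogeneous case. Writing $p^{\mathrm{sym}}$ for the probability of presenting symptoms, substituting $\pi = \pi^{\mathrm{sym}}/p^{\mathrm{sym}}$ into $\pi = 1 - e^{-R_0 \pi}$ gives $\pi^{\mathrm{sym}} = p^{\mathrm{sym}}(1 - e^{-(R_0/p^{\mathrm{sym}})\pi^{\mathrm{sym}}})$, which is not of the form $x = 1 - e^{-Rx}$ unless $p^{\mathrm{sym}} = 1$. In the sketch below all parameter values are hypothetical, and the number `R_naive` is only an illustrative stand-in for a "symptomatic" reproduction number; it is not claimed to be the $\tilde{R}_0$ defined in the text.

```python
import math

def solve_final_size(R, iters=500):
    """Largest solution in [0, 1] of x = 1 - exp(-R x), iterating from x = 1."""
    x = 1.0
    for _ in range(iters):
        x = 1.0 - math.exp(-R * x)
    return x

R0 = 2.5                    # hypothetical basic reproduction number
p, gamma1 = 0.3, 0.2        # hypothetical symptom-onset and recovery rates
p_sym = p / (p + gamma1)    # probability of presenting symptoms

pi = solve_final_size(R0)   # final infection size
pi_sym = p_sym * pi         # final symptomatic size

# pi_sym satisfies the rescaled equation (residual is zero up to rounding)
residual = pi_sym - p_sym * (1.0 - math.exp(-(R0 / p_sym) * pi_sym))

# Naive use of x = 1 - exp(-R x) with an illustrative reproduction number
R_naive = p_sym * R0        # assumption for illustration only
x_naive = solve_final_size(R_naive)

print("final infection size:       ", pi)
print("final symptomatic size:     ", pi_sym)
print("naive 'final size' estimate:", x_naive)
print("residual of rescaled eq.:   ", residual)
```

For these values the naive estimate coincides with neither the final infection size nor the final symptomatic size.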
Although it adds realism in terms of population structure, the formalism presented here still relies on important simplifying assumptions, such as the time-independence of the susceptibility level discussed in Sect. 2.1. Recent work has analysed the final size for situations where there is a permanent reduction in mixing intensity (and hence a reduction in the susceptibility level) that depends on the epidemic progression (Gog and Hollingsworth 2021). Although the test individual trick does not seem to generalize in a straightforward way to time-dependent scenarios, it is worth studying whether the trick can be extended at least to the simple model considered in Gog and Hollingsworth (2021), for which final infection size formulas do exist. If this were the case, these formulas could perhaps be generalized to heterogeneous populations (with possibly asymptomatic individuals) by proceeding as we do in this paper.
Acknowledgements C.B. and S.C. have been supported by Grants PID2021-123733NB-I00 and 2021 SGR 00113. P.-A. B. is especially grateful to C.B. and S.C. for their proposal to work on this study.
Funding Open Access Funding provided by Universitat Autonoma de Barcelona.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Appendix A Proof of Theorem 1
To prove Theorem 1, we follow the ideas in Magal et al. (2016, 2018), where a similar problem is addressed. The results stem from the theory of monotone discrete dynamical systems (see Section 5 in Hirsch and Smith 2005). First we define the relation $\le$ between elements of $\mathbb{R}^m$ as $u \le v$ if and only if $u_i \le v_i$ for all $i \in \{1, \dots, m\}$, and the interval $[u, v]$ as the set of all $w \in \mathbb{R}^m$ satisfying $u \le w \le v$. In addition, we consider the following three general results.
, and applying this argument inductively it follows that $F^l(u_0) \le F^l(w) \le F^l(v_0)$ for all $l \in \mathbb{N}$. Now let $\bar{w}$ be an accumulation point of the orbit of $w$ by $F$, i.e. there exists an unbounded subset of indices $\{l_k\}_{k \in \mathbb{N}}$ such that $\lim_{k \to \infty} F^{l_k}(w) = \bar{w}$, which implies

Let $F$ be as in Proposition 2 and let $u$ and $v$ be fixed points of $F$. A doubly infinite sequence $\{x_n\}_{n \in \mathbb{Z}}$ in $\mathbb{R}^m$ is called an entire orbit from $u$ to $v$ if $x_{n+1} = F(x_n)$ for all $n \in \mathbb{Z}$, $\lim_{n \to -\infty} x_n = u$ and $\lim_{n \to \infty} x_n = v$.

Although the dynamical behaviour of the third fixed point $\bar{w}$ is not stated in Corollary 5.12 of Hirsch and Smith (2005), notice that at least one fixed point in $[u, v]$ will not be locally asymptotically stable. To see this, let $A$ be the set of all fixed points of $F$ in $[u, v]$ which are locally asymptotically stable. Since $[u, v]$ is compact, $A$ is necessarily finite. Otherwise there would exist an accumulation point in $[u, v]$ of locally asymptotically stable fixed points, which would imply that such a point would converge to infinitely many different points, which is impossible. Choose $v' \in A$ such that the only elements of $A$ in $[u, v']$ are $u$ and $v'$. Then, Corollary 5.12 of Hirsch and Smith (2005) ensures the existence of a third fixed point $\bar{w} \in [u, v'] \subset [u, v]$, and $\bar{w}$ is not locally asymptotically stable because $\bar{w} \notin A$ by construction.

Before proceeding further, let us see how $R_0$ is related to the mapping given by $F$ in (14). Recall that $R_0$ can be defined as the spectral radius of the next-generation operator $G$. This operator, which is a matrix when there are only finitely many classes of infected individuals, gives the distribution of secondary infections as a function of the distribution of primary infected individuals. In particular, if $e_i$ denotes the $i$th canonical vector, the $j$th entry of the image vector $G e_i$ (which coincides with the entry $(j, i)$ of the matrix $G$, i.e. $G_{j,i}$) gives the expected number of secondary infected individuals of class $y_j$ that a primary infected individual of class $y_i$ produces.
Taking this observation into account, it is easy to construct the matrix $G$ using the values $R_0(x_k, y_i)$ and $p_{x_k \to y_j}$. Indeed, since $R_0(x_k, y_i)$ gives the expected number of susceptible individuals of class $k$ infected by a primary infected individual of class $i$ and $p_{x_k \to y_j}$ gives the probability that when a susceptible of class $k$ is infected it becomes an infected individual of class $j$, it turns out that

$$G_{j,i} = \sum_{k=1}^{n} R_0(x_k, y_i)\, p_{x_k \to y_j}. \tag{A1}$$
We conclude, therefore, that R 0 = ρ(G) where G is the matrix whose entries are defined in (A1).
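The construction of $G$ from (A1) and the computation of $R_0 = \rho(G)$ can be sketched in a few lines. In this minimal example the array names and the numerical values are hypothetical: `R0[k, i]` stores $R_0(x_k, y_i)$ and `P[k, j]` stores $p_{x_k \to y_j}$ (each row of `P` sums to one).

```python
import numpy as np

def next_generation_matrix(R0, P):
    """Return G with G[j, i] = sum_k R0[k, i] * P[k, j], as in (A1)."""
    return np.asarray(P, float).T @ np.asarray(R0, float)

def spectral_radius(G):
    """Largest modulus among the eigenvalues of G."""
    return float(np.max(np.abs(np.linalg.eigvals(G))))

# Hypothetical two-class example
R0 = np.array([[1.2, 0.4],
               [0.3, 0.9]])     # R0[k, i] = R_0(x_k, y_i)
P = np.array([[0.7, 0.3],
              [0.2, 0.8]])      # P[k, j] = p_{x_k -> y_j}

G = next_generation_matrix(R0, P)
rho = spectral_radius(G)
print(G)
print("basic reproduction number:", rho)
```

When every susceptible of class $k$ becomes an infected individual of the same class (`P` equal to the identity), $G$ reduces to the matrix of values $R_0(x_k, y_i)$.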
On the other hand, notice that 0 is a fixed point of $F$. It is well known that the local behaviour of 0 is determined by the spectral radius of $DF(0)$ (it is locally asymptotically stable if $\rho(DF(0)) < 1$ and it is unstable if $\rho(DF(0)) > 1$). A quick computation shows that $DF(0) = G$, which implies that 0 is locally asymptotically stable if $R_0 < 1$ whereas it is unstable if $R_0 > 1$. To prove that 0 is in fact a global attractor of the positive cone when $R_0 < 1$, we need the following results:

Proposition 5 Let $F$ be defined in (14). Then, $F$ is component-wise increasing, i.e. $F(u) \le F(v)$ whenever $u \le v$.

Proof Notice that $\partial_i F_j(w) \ge 0$ for all $w \in \mathbb{R}^m$ and $i, j \in \{1, \dots, m\}$.

Proposition 6 Let $F$ be defined in (14). Let $u, v \in \mathbb{R}^m$. If $0 \le u \le v$, then $\rho(DF(u)) \ge \rho(DF(v))$.

Proof First notice that the entry $(j, i)$ of $DF(u)$, i.e. $\partial_i F_j(u)$, is greater than or equal to the entry $(j, i)$ of $DF(v)$, i.e. $\partial_i F_j(v)$. This follows from the fact that $\partial_{ki} F_j(w) \le 0$ for all $i, j, k \in \{1, \dots, m\}$ and $w \in \mathbb{R}^m$. Moreover, all entries of $DF(u)$ and $DF(v)$ are nonnegative. Therefore, $DF(u)^l \ge DF(v)^l$ entrywise for all $l \in \mathbb{N}$, and applying Gelfand's formula we finally conclude

$$\rho(DF(u)) = \lim_{l \to \infty} \|DF(u)^l\|^{1/l} \ge \lim_{l \to \infty} \|DF(v)^l\|^{1/l} = \rho(DF(v)).$$

Proposition 7 … ii. for all $w \ge \bar{v}$, $\lim_{l \to \infty} F^l(w) = \bar{v}$.
Proof To prove i., notice that $F^{l+1}(v_0) \le F^l(v_0)$ for all $l \in \mathbb{N}$, so that the sequence $\{F^l(v_0)\}_{l \in \mathbb{N}}$ is decreasing component-wise. Since the orbit of $v_0$ is bounded (notice that $0 \le F^l(v_0) \le v_0$ for all $l \in \mathbb{N}$), it follows that the sequence $\{F^l(v_0)\}_{l \in \mathbb{N}}$ converges to a point $\bar{v} \in [0, v_0] \subset \mathbb{R}^m$. Since $F$ is continuous, $\bar{v}$ is a fixed point of $F$. To prove ii., notice that for all $w \ge v_0$ one has $F(w) \le v_0$. Then, since $F$ is component-wise increasing, $v_0 \ge \bar{v}$ and $\lim_{l \to \infty} F^l(v_0) = \bar{v} = \lim_{l \to \infty} F^l(\bar{v})$, by applying Proposition 2 it follows that the accumulation points of $\{F^l(w)\}_{l \in \mathbb{N}}$ belong to $[\bar{v}, \bar{v}] = \{\bar{v}\}$, which implies that $\{F^l(w)\}_{l \in \mathbb{N}}$ converges to $\bar{v}$.
As a corollary of these results, it follows that:

Proposition 8 If $R_0 < 1$ then $\lim_{l \to \infty} F^l(w) = 0$ for all $w \in \mathbb{R}^m_+$, with $\mathbb{R}_+ := (0, \infty)$.

Proof This is shown by noticing that in this case the $\bar{v}$ of Proposition 7 is necessarily 0. Indeed, if $\bar{v} > 0$, then by Proposition 4 there exists a fixed point $w \in [0, \bar{v}]$ which is not locally asymptotically stable, but by Proposition 6 it follows that $1 > R_0 = \rho(DF(0)) \ge \rho(DF(w))$, which contradicts this fact.

Proposition 9
If $R_0 > 1$, then $\lim_{l \to \infty} F^l(w) = \bar{v} \neq 0$ for all $w > \bar{v}$. Moreover, if 0 and $\bar{v}$ are the only fixed points of $F$ in the positive cone of $\mathbb{R}^m$, then $\lim_{l \to \infty} F^l(w) = \bar{v} \neq 0$ for all $w \in \mathbb{R}^m_+$.

Proof Since the basin of attraction of $\bar{v}$ contains an open set (namely all points $w \in \mathbb{R}^m$ satisfying $w \ge \bar{v}$), necessarily $\rho(DF(\bar{v})) \le 1$. Therefore, since $R_0 = \rho(DF(0)) > 1$, it follows that $\bar{v} \neq 0$. Applying Proposition 7, we conclude the first part of the statement.
In order to show that the interior of the positive cone of $\mathbb{R}^m$ belongs to the basin of attraction of $\bar{v}$ when 0 and $\bar{v}$ are the only fixed points of $F$ in the positive cone, first notice that by Proposition 3 there is an entire orbit from 0 to $\bar{v}$ (since the existence of an entire orbit from $\bar{v}$ to 0 would imply $\rho(DF(0)) \le 1$). Therefore, for all $w \in \mathbb{R}^m_+$ we can take $w^- \in (0, \bar{v}]$ from that entire orbit and $w^+ \ge \bar{v}$ such that $w \in [w^-, w^+]$. Then, since $\lim_{l \to \infty} F^l(w^-) = \bar{v} = \lim_{l \to \infty} F^l(w^+)$, by Proposition 2 we conclude $\lim_{l \to \infty} F^l(w) \in [\bar{v}, \bar{v}] = \{\bar{v}\}$.
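The threshold behaviour established in this appendix can be illustrated numerically. The sketch below is not part of the proof and rests on an assumption: since (14) is not restated here, it takes $F_j(w) = \sum_k p_{x_k \to y_j}\, s_{0,k}\,(1 - \exp(-\sum_i R_0(x_k, y_i)\, w_i / s_{0,k}))$, a form for which $DF(0) = G$ with $G$ as in (A1). All numerical values are hypothetical.

```python
import numpy as np

def make_F(R0, P, s0):
    """Build the assumed map F (an assumption of this sketch, see lead-in)."""
    R0, P, s0 = (np.asarray(a, float) for a in (R0, P, s0))
    def F(w):
        infected = s0 * (1.0 - np.exp(-(R0 @ w) / s0))  # per susceptible class x_k
        return P.T @ infected                           # redistributed over classes y_j
    return F

def orbit(F, w, n):
    """Return the array of iterates w, F(w), ..., F^n(w)."""
    out = [np.asarray(w, float)]
    for _ in range(n):
        out.append(F(out[-1]))
    return np.array(out)

s0 = np.array([0.6, 0.4])                 # initial susceptible fractions
P = np.array([[0.7, 0.3], [0.2, 0.8]])    # P[k, j] = p_{x_k -> y_j}
R0_base = np.array([[1.2, 0.4], [0.3, 0.9]])
v0 = np.ones(2)                           # upper bound: F(w) <= v0 for all w >= 0

results = {}
for scale in (0.5, 1.0):
    R0 = scale * R0_base
    rho = float(np.max(np.abs(np.linalg.eigvals(P.T @ R0))))  # rho(DF(0)) = rho(G)
    iterates = orbit(make_F(R0, P, s0), v0, 200)
    results[scale] = (rho, iterates)
    print(f"rho(G) = {rho:.3f}, limit of F^l(v0) = {iterates[-1]}")
```

In the subcritical case the iterates decrease to 0, whereas in the supercritical case they decrease to a nonzero fixed point, in line with Propositions 8 and 9.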