A POPULATION MODEL CAPTURING DYNAMICS OF TUBERCULOSIS GRANULOMAS PREDICTS HOST INFECTION OUTCOMES

Granulomas play a centric role in tuberculosis (TB) infection progression. Multiple granulomas usually develop within a single host. These granulomas are not synchronized in size or bacteria load, and will follow different trajectories over time. How the fate of individual granulomas influence overall infection outcome at host scale is not understood, although computational models have been developed to predict single granuloma behavior. Here we present a within-host population model that tracks granulomas in two key organs during Mycobacteria tuberculosis (Mtb) infection: lung and lymph nodes (LN). We capture various time courses of TB progression, including latency and reactivation. The model predicts that there is no steady state; rather it is a continuous process of progressing to active disease over differing time periods. This is consistent with recently posed ideas suggesting that latent TB exists as a spectrum of states and not a single state. The model also predicts a dual role for granuloma development in LNs during Mtb infection: in early phases of infection granulomas suppress infection by providing additional antigens to the site of immune priming; however, this induces a more rapid reactivation at later stages by disrupting immune responses. We identify mechanisms that strongly correlate with better host-level outcomes, including elimination of uncontained lung granulomas by inducing low levels of lung tissue damage and inhibition of bacteria dissemination within the lung.


Introduction
Tuberculosis (TB) is a disease caused by inhalation of Mycobacterium tuberculosis (Mtb). It mainly infects the lungs of adults, but other organs such as brain and kidneys can become infected as infection progresses, with bacteria disseminating via blood (extrapulmonary TB). A third of the world population is estimated to be infected with Mtb [38]. However, most people (about 80%) are able to control infection and are classified as having latent TB infection (or LTBI). Those with LTBI have a lifetime reactivate rate of 10%, which increases to a rate of 10% per year if individuals are co-infected with HIV-1 or other immunosuppressive scenarios [37,11]. Patients with LTBI cannot transmit the disease. The remaining 20% of individuals are not able to control infection and progress to active TB disease, which, if left untreated leads to death within a year or two. Treatment requires a long course of multiple antibiotics. What remains an open question in TB is why the course of infection differs from person to person leading to these different disease trajectories.
Mtb is an ancient pathogen (Egyptian mummies have been found that have TB granulomas that have calcified) and has evolved unique ways to avoid being cleared by the host. Resident macrophages in the lungs take up inhaled particles via phagocytosis to keep lungs clear of foreign matter. However, Mtb grows inside macrophages; it is their cell of choice with optimal growth conditions [42]. During both latent and active Mtb infection, structures called granulomas form at sites of infection, primarily in lungs. A granuloma is a collection of host immune cells such as macrophages, T and B lymphocytes and neutrophils, as well as bacteria [19]. Granulomas are considered to be a protective structure that immunologically restrain and physically contain bacteria. Recently it has been suggested that bacteria might utilize this structure to avoid being cleared, promote proliferation, and even facilitate dissemination [41]. Therefore, the granuloma is a highly dynamic structure reflecting an ongoing battle between host and pathogen.
A significant barrier to studying granulomas is that they are in lungs, and human lung tissue is not readily accessible. Recent findings have shown that in non-human primates (NHPs, the best animal model to study human TB), different granulomas, each seeded by a single mycobacterium, can have heterogeneous fates over time [25]. For example, in one monkey with active disease there were 18 granulomas, 6 of which were completely sterile and 2 with over a million bacteria each. In another monkey with latent infection there were 5 granulomas, 3 of which were sterile and 2 with bacteria levels ranging from 10 3 -10 5 [24]. Granulomas do not form synchronously-after infection a few may form on the same time scale, but others get seeded later or are formed from disseminating granulomas that are present. These findings imply that infection outcome may be determined by the dynamics of the collection of granulomas within an individual. Granulomas are also found in lymph nodes (LN) in humans and NHPs [7,26]. As LNs are the site where adaptive immune responses against Mtb are initiated, this could have important implications for the hostresponse and infection dynamics.
At present, there is no way to correlate counts of granulomas with infection status and disease progression. We have published extensively using mathematical and computational models to understand the dynamics in-host of immune cells and bacteria interactions in TB. We have used ODE models to describe total populations in lungs over time [46,32], and developed agent-based models to capture spatiotemporal dynamics of individual granuloma formation and function [43,17,30,33,15,16,8]. Others have studied this as well [29,28,4]. However, these models do not bridge to the host scale as each captures only a portion of what occurs in the host that leads to an infection outcome. In addition to immune modeling, we and others have built models that track the dynamics of TB epidemics in many different populations and under various scenarios [22,35,36,44,48,18,6,21,5,3,14,34].
Here we combine these approaches by using a standard SEIR population model to capture in-host features to track the epidemic evolving within an infected host. By adapting these epidemiological models to capture the dynamics of a "population" of granulomas in the entire body, we can bridge the two scales of lung granuloma and host to not only understand the role that lung and LN granulomas play in TB disease progression, but also to potentially foster a prediction tool for patient level disease trajectory. Such knowledge could guide development of advanced and personalized treatment strategies, and eventually help to control the global TB epidemic more effectively.

Granuloma in lung and LN
We constructed our mathematical model based on standard SEIR model formulations capturing a TB epidemic in human populations. In our model, granulomas form in two potential infection sites: the lung (species with subscript P, pulmonary) and LNs (species with subscript L) ( Figure 1).
We treat the collection of granulomas of each state in both organs. Black arrows indicate transitions between states. In lung tissue, granuloma are in one of four states: Susceptible (S P ) represents healthy tissues that are potential sites for granulomas but are not yet seeded with bacteria; Exposed (E P ) represents granulomas that are able to contain or control bacteria growth and prevent dissemination of bacteria; Infected (I P ) represents granulomas that are unable to control bacteria proliferation and can disseminate and infect healthy tissue. Finally, a compartment called Dead (D P ) represents tissue that is cleared of bacteria but is unhealthy and damaged from the immune response and is unable to recover due to necrosis, caseation, fibrosis and/or calcification processes that occur naturally during infection. Because lung tissue regeneration in adult humans is minimal, we assume a fixed total number of sites in the lung [40]. Populations in the LNs are similar but not identical. LN tissues do regenerate, and dead tissue gets cleaned-up more effectively [39]. As a result, we introduce turnover to healthy sites (S L ) in LNs and do not track dead tissue; other conversions between states are similar for both compartments. Exposed sites (contained granulomas) can recover or progress and allow uncontrolled bacteria proliferation and are classified as infectious (I P or I L ), disseminating to other susceptible sites in the lung (S P ) and LN (S L ).

Adaptive immunity
The adaptive immune system (f I in Figure 1) is necessary for the body to clear or contain Mtb infection. To successfully generate a potent adaptive immune response, three factors are essential (dotted arrows): (1) antigen presentation events, (2) availability of Mtb-specific immune cells (e.g. naive T cells), and (3) an environment for immune cells to recognize the antigen and be primed (a healthy LN in this case). We combine these three components into a coarse-grained representation of an immune response. We use a simplified version of each based on our previous modeling studies that capture immune dynamics during Mtb infection [23,47,20,27]. Each is described in more details below.
Antigen presentation-Antigen is defined as a substance (usually as short proteins fragments called peptides) that the immune system recognizes as foreign. Mtb antigen must be "presented" by antigen presenting cells (APCs, such as macrophages and dendritic cells) to T cells to initiate and sustain a specific immune response to Mtb, i.e. the generation of cells that respond to the infection. Antigen presentation can be coarsely represented as a combination of numbers of contained and disseminating granulomas in lung and LNs, weighted by the accessibility (p and q) of antigenic materials from these granulomas to the APCs.
Immune cells-Mtb-specific immune cells, primarily T cells, are required for successful immune responses. Each T cell responds to one and only one antigen and there are a plethora of antigens that comprise Mtb; however, it remains unknown which key antigens that should be responded to in order to provide protection. We assume the initial maximum T cell response to be at its full potential (1.0 unit), and for long-term immune dynamics, we use a time-dependent decreasing function to capture the decline of immunity and available immune cells over the lifetime of the host [47]. As TB is a disease that can last for decades, these two time-scales are relevant and appropriate.
Lymph node environment-There are approximately 700 LNs in humans. Only a small fraction (<10) are lung-draining LNs that typically participate in infection dynamics. APCs from the lung migrate to one of the lung-draining LNs to initiative an adaptive response in that LN, but may also seed infection in the LN by trafficking Mtb. LNs provide an environment for immune cells to recognize Mtb antigen presented by APCs and be primed to return to the site of infection (lungs) to participate in the immune response, which for TB means granuloma formation and function. LN status will be evaluated by the fraction of LN sites not yet hosting granulomas, i.e. S L /S L0 , where S L0 is the total number of LN uninfected sites available prior to infection. This specific immune response influences granuloma development in multiple ways and at multiple stages. In the model, immunity affects the following processes ( Figure 1, dashed arrows): (1) it can induce recovery of contained granulomas (E) to be resolved and return to a state of healthy tissue (S), (2) it reduces the cases of new infection (E and I from S) caused by disseminating granulomas, (3) it can inhibit contained granulomas from disseminating (E to I), and (4) it can aid elimination of disseminating granulomas (I to D) ( Figure 1). These influences are captured as parameters that serve as multiplicative factors adjusting rates in the model (see below).

Model equations
Following typical SEIR dynamic modeling, the transitions between these states are represented by the following set of ODEs: and (7) where (8) β PP and β PL are the rates of a healthy site in the lung infected with Mtb and forming a granuloma from a disseminating GR in lung or LN. β LP and β LL are the rates of a healthy site in LN being infected with Mtb and a granuloma forming from a disseminating GR in lung or LN. Infection is inhibited by adaptive immunity, as the rate is multiplied by a modifier term (K I is a constant for dissemination inhibition): (9) Newly infected sites will progress to either a contained (E) or disseminating (I) state, depending on various environmental factors, and we use a parameter ϕ to determine the probability of a GR site progressing to contained state as opposed to disseminating state upon new transmission. δ is the rate that a contained GR heals via a functional immune response (f I ) and reverts to susceptible tissue again. k is the rate that contained (E) GR progress to a disseminating (I) GR without any immune response. γ is the rate of Gong et al. Page 5 disseminating GRs being cleared by adaptive immunity. Contained and disseminating GRs might be cleared by mechanisms other than adaptive immunity, so we capture these mechanisms with a constant fractional rate μ E or μ I , which should be small compared with δ and γ. For δ, k, γ, μ E and μ I , subscript P or L indicate that the process happens in the lung or LN, respectively. S P0 and S L0 are the total number of initial susceptible sites in the lung and LN, respectively. Unlike in LNs where we explicitly capture the regeneration of healthy LN tissues, In the lung, the sum of sites of all four states is a constant: (10) f I is the overall adaptive immune response factor. It is the product of three factors: Antigen availability, time-dependent immunity potential, and the intactness of the LNs. It is calculated with Equation (11). The range for f I is between 0 and 1: (11) Ag represents the total amount of antigen available to the adaptive immune system. It is calculated by the sum of weighted numbers of contained and disseminating GRs in the lung and LN. p and q are larger than 1, assuming that antigens in a disseminating GR are more accessible than in a contained GR, and those in the LN are more accessible than in lung. The contribution of antigen to immunity follows a Michaelis-Menten function, where K A is the amount of Ag required to achieve half maximum immune activity: (12) ( 13) τ(t) describes the fraction of optimal immune function in the body over time. In the long term, this function declines with age. It may also change in response to other events, such as HIV co-infection (which reduces the number of CD4+ T cells) or immunosuppressive therapies. Total numbers of nave T cells decline by 5% per year in humans [47], so we assume Mtb-specific T cells available in circulation decline at the same rate.

Uncertainty and sensitivity analysis
There are limited time course data on granulomas that could be used to calibrate model parameters. Thus, we estimated biologically reasonable ranges for each parameter by performing uncertainty analysis and examining the dynamics of different types of GRs to see if their numbers and trends are roughly within the range as observed in experiments [25,9]. We use Latin Hypercube Sampling (LHS) as it efficiently covers the high dimensional parameter space. Different regions of parameter space correspond to different dynamics, reflecting various disease trajectories in human. In our study, we define scenarios where the number of disseminating GR remains low (<0.1) as latency, and these where the disseminating GR quickly increase as reactivation. Among different possible combinations of parameters that produce latency dynamics, we picked a one as our baseline.
We also use sensitivity analysis to study how each model mechanism (via parameters) are correlated with different model outputs [31]. The ranges used for each parameter are shown in Table 3. 5000 parameter sets are generated, and we solve the model system numerically for each of the parameter sets using MATLAB (http://www.mathworks.com/products/ matlab/). After simulations are performed, we calculate the Partial Rank Correlation Coefficients (PRCC) between model outputs of interest (e.g. number of disseminating GRs at time point 10 years in each compartment) and parameters that were varied. PRCCs are tested for significance with significance level of α = 1 × 10 −9 .

Basic reproduction number R 0
In epidemic modeling, R 0 is a metric calculated to determine the outcome of a disease at population level. It can be used as an indicator describing that when a few individuals in a susceptible population are infected with a disease, how likely the disease is going to spread in a population. If the value is smaller than 1, the disease will die out in the long term; however, if its larger than 1 it will spread through the entire population. Similarly, we can calculate an R 0 for our system to see the potential of GR to spread in the lung and LNs.
When the system is at its disease-free equilibrium (S P0 , 0, 0, 0, S L0 , 0, 0), the Jacobian Matrix is calculated as: (15) After decomposition using a method introduced in [13], we get the transmission and transition matrices: From these results, we can calculate the next generation matrix (NGM) with small domain: The larger eigenvalue of K S is: (21) At the disease-free equilibrium, as the adaptive immune response has not yet been activated, the mechanism available to limit infection spread is solely innate immunity, which is represented by μ E and μ L in Equation (21). When adaptive immunity does not occur, TB infection always progresses to active disease. Such observation indicates that the R 0 for an in-host granuloma epidemic is larger than 1.

Latency and reactivation
Although R 0 is larger than 1, the number of granulomas does not necessarily increase when the system is away from the disease-free equilibrium. This is because adaptive immunity is initiated upon antigen encounter. As a result of the model formulation, the system has only one equilibrium, which is the disease-free state. However, there could exist quasi-equilibria for E and I states, where the rates at which healthy sites become infected, and infected sites become damaged are both very slow. We found an example parameter set defined as the latent infection scenario (Table 1, Latent), and the dynamics are shown in Figure 2. The Gong et al. Page 8 initial conditions are listed in Table 2 (the same initial conditions are applied to following simulations, with the initial E P and I P dependent on ϕ P ). In this simulation, we observe that after an initial decline, the number of contained GR increases very slowly. The percentage of disseminating GR is undetectable (below 0.1). In the model, this corresponds to a latently infected individual followed over a long period of time (more than 10 years).
By adjusting parameters, such as the sensitivity of adaptive immunity to antigen stimulation (K A ) ( Table 1, Reactivation), we can simulate different disease outcomes at the 10-year time point, as shown in Figure 3. In this scenario, an individual remains latently infected for about 5 years, until the number of infectious GRs accumulates and the number of immune cell declines, together inducing a higher rate of dissemination. Then the number of both contained and disseminating GRs in the lung and LNs begin to increase rapidly, and we could classify the individual as having active TB (via reactivation). An individual who has active TB is infectious to others. Without treatment, the fraction of functional lung and LN tissue declines and eventually the host will succumb to disease.

Reinfection causes activation of latent disease
In areas with a high prevalence of active TB cases, patients with latent TB can experience reinfection when they contract bacteria from other infected hosts. To study the effects of such events, we tested how reinfection alters disease outcomes by initializing with the same parameter set as in the latent scenario, and then adding 20 new GRs to the system (lung and LN) each year. The result is shown in Figure 4. We can see that every time that a reinfection event occurs, the immune system attempts to control it. Such attempts are quite successful at first; however, we observe higher levels of disseminating GR levels after reinfection. After the fifth reinfection event, the system cannot bring the number of disseminating GRs back under control (<0.1). The dynamics then follow the pattern we see in the reactivation scenario.

Impaired immune function causes activation of latent disease
A host's immune system may not always remain at the same level. Aging can reduce the potential of the immune system, as the thymus shrinks and production of new T cells wanes. In patients with autoimmune diseases such as rheumatoid arthritis, treatment may involve immunosuppression. In South Africa, HIV co-infection with TB is seen in many cases, and the CD4+ T cell population is severely reduced. We tested the impact of an impaired immune system on TB disease outcomes using the baseline parameter set for latent infection (Table 3). We assume that τ(t) (the fraction of T cells available) drops by half at time point 5 years. The result is shown in Figure 5. In this situation, the dynamics immediately switch to a reactivation pattern after the decline in immune function, indicating that latent infection progresses to active disease.

Role of lung granuloma formation
If contained GRs provide a place for bacteria to evade the immune system, exposing bacteria by expediting their transition to uncontained GRs might promote healing. To examine the role of contained GR in the lung, we modified the latent infection parameter set by increasing k 1 (the rate of GR reactivation) from 0.002 to 0.02 day −1 , so that GR stay in a contained state for a shorter period of time ( Table 3). The simulation result is shown in Figure 6. In our simulation, even though the number of contained GRs drops to a slightly lower level initially at the one-year time point, the disease progresses to an active phase with a significantly shortened latent period. This result supports the hypothesis that contained GRs in lungs are favorable to the host by leading to a milder disease outcome rather than preventing the clearance of the bacteria.

Role of granuloma in LN
Granulomas are observed to develop in LNs of humans and NHP models of TB [26]. On the one hand, granulomas present in LNs may provide continual antigen stimulation and facilitate development and maintenance of faster and stronger immune responses. On the other hand, infection within LNs will likely interrupt the immune response machinery by disrupting the physical structure as well as usurping immune cells headed for the lung and detaining them in LNs. We next examined the role of GRs disseminating to LNs. We altered the reactivation scenario parameter set by setting β LP (dissemination from the lung to LNs) to 0 day −1 , so that lung GRs will not seed the LN. The result is shown in Figure 7. As expected, no infection occurs in the LN compartment. The dynamics in lungs are different from what we see in a reactivation scenario (Figure 3). In the beginning, the immune system is not able to contain the infection as successfully as in the reactivation case. We see a larger number of contained GRs, and the number of disseminating GRs is larger than 0.1 from the beginning. However, the disease never progress to the catastrophic end stage as we see in the reactivation case. These results support a dual role of GRs in the LNs. By disseminating to the LN, the increased amount of bacteria material available to the immune system induces a more potent response and better controls infection; they also damage the LN structure and eventually cause the systemic collapse.

Sensitivity analysis
To identify mechanisms that contribute to differential host level disease outcomes, we varied the values of model parameters using uncertainty analysis and assessed the sensitivity of the model output to differences in model parameters. The mechanisms we included and the ranges of the parameter values examined are shown in Table 3 (LHS). We use the number of disseminating GRs as our model readout, as it is associated with uncontrolled growth of Mtb thus usually an indication of TB reactivation. We choose α = 1.0 × 10 −9 as the significance level, and the PRCC values are shown in Figure 8. Mechanisms corresponding to each of these parameters in the model can serve as possible therapeutic targets and should be explored.
We can see that γ P (the rate of infected GRs turning into damaged tissues) has a high and negative correlation with disease, suggesting that one way to control infection is to tolerate a small level of tissue damage as a tradeoff for infection. Indeed, when antibiotics are given to TB patients, they induce fibrosis in the lung, a tissue damage that is irreversible [10,1,12]. As long as the extent of damage is controlled, this option is viable. We can also see that other mechanisms, including β PP (dissemination within the lung), β LP (dissemination from the lung to LNs) and k P (lung GR reactivation) are positively correlated with disease progression. This suggests that efforts should focus on lung GRs, reducing their reactivation and dissemination as opposed to LN GRs. Furthermore, K A , the sensitivity of adaptive immunity to antigen stimulation, is positively correlated with disease. This means that another potential method to containing infection is to reduce the amount of antigen required to mount an adaptive immune response by sensitizing the immune system.

Discussion
We constructed a deterministic model of a "within host" epidemic of TB infection based on dynamics of granulomas in lung and LNs. Granulomas are the key site of infection for Mtb infection, and many form in lungs and LNs in infected hosts over the time course of infection. Our model suggests that even though the R 0 is likely to be larger than 1, we can still achieve latent infection with almost constant levels of contained GRs. There is a new paradigm for TB that suggests that the infection outcomes are not a binary between latent and active TB individuals, but that hosts with latent TB really exist on a spectrum [2]. On the reactivation side of the spectrum, one can more easily move to active disease, for example; however, one is less likely to develop active disease in their lifetime if they fall more to the other side [45]. Perturbations to the host (such as co-infections, aging, etc.) can move an infected individual along the spectrum during their lifetime increasing their risk of active infection. Our findings are consistent with this as they predict that latency is not truly an equilibrium state, but one in flux, even if that change happens over a very long timescale.
Latent infection is sensitive to the responsiveness of the adaptive immunity to antigen. The model shows that reinfection and decline of the immune system trigger reactivation of TB. We analyzed the differing roles of GR in both the lung and LNs. Results indicate that GRs in lungs have a protective function. However, the role of LN granuloma has two facets. On the one hand, they provide the immune system with more antigens, which helps to reduce the disseminating GRs in the lung and prevent the host from becoming infectious. On the other hand, these GRs damage the LN structure and impair immune function, eventually resulting in severe tissue damage in both the lung and LN.
Our sensitivity analysis identified mechanisms that strongly correlate with better host-level outcomes. Figure 8 summarizes those findings, indicating mechanisms that could be potential targets for therapeutics. Healing of disseminating lung GRs induces tissue damage in the lung, as fibrotic lung tissue will not regain its normal function. However, one finding of the model is that by increasing the rate of this mechanism, we can induce low levels of lung tissue damage early on and avoid more severe infection outcomes long term. More obvious mechanisms include inhibition of bacteria dissemination within the lung. These predictions can aid researchers to focus design of possible targets for TB treatment.
The implication of our results is that by sensitizing a patients immune system, we can lower the risk of the reactivation of TB significantly. Since the course of TB antibiotic treatment is long (over 6 months with multiple drugs), and the lack of treatment compliance is a major hurdle in TB treatment, it could be practical to stimulate patient immune systems even after diagnosis, if there is a suitable vaccine. By applying this method to the entire human population, we may keep more infected individuals in latency for a much longer period of time, in effect reducing the population level R 0 .
We can speculate why a tipping point for different disease outcomes exists in our system. There could be two quasi-equilibria besides the disease-free equilibrium. The smaller one is expected to be stable, representing the latent cases. The larger one is likely unstable, and a GR load exceeding this number will further spread infection. Because damage to LNs accumulates and numbers of available immune cells decreases, the latent quasi-equilibrium slowly increases, reflecting a decrease in immune system ability to control infection to a lower level. However, once LN damage reaches a certain threshold, the two quasi-equilibria join and bifurcation occurs. Beyond this point, the disease can no longer be contained by the immune system. This remains to be proven mathematically.
To further analyze this model, we should calculate the two quasi-equilibria, and confirm stability by computing the leading eigenvalues for the next generation matrix at these states. After finding a closed form expression of the tipping point, we can measure the GR load of an infected patient to determine his/her risk of reactivation. By generating a risk distribution of a population, we might link our patient scale model to population scale, and theoretically analyze how population level R 0 is influenced by parameters in our model. For instance, if we can measure the impact of a new drug on a single GR, we would potentially be able to predict its influence in TB epidemic in the entire population.  Quasi-steady state describing latent TB in an individual host. The numbers of susceptible, exposed, infected and dead sites in the lung (panel A) and LN (panel B) are plotted against time. After the initial drop, the number of contained granuloma stays relatively stable over the 10-year simulation period. Panel A (lung): S represents susceptible, E are exposed, I are infected lung tissue and D is dead lung tissue. Panel B (LN): S represents susceptible LN tissue, E are exposed and I are infected LN tissue. Parameters: Table 1, Latent.     Infection dynamics without LN Granuloma formation. Dissemination rate of TB from the lung to LN is set to 0. Panel A shows dynamics in the lung and Panel B shows dynamics in the LN. Panel A (lung): S represents susceptible, E are exposed, I are infected lung tissue and D is dead lung tissue. Panel B (LN): S represents susceptible LN tissue, E are exposed and I are infected LN tissue. Parameters: Table 3, LN GR role. fold of increase of antigen accessibility in disseminating GRs compared to contained GRs; γ P : healing of contained GRs; β LL : dissemination in LNs; K I : adaptive immunity level required to reduce dissemination by half; k L : GR reactivation in LNs; k P : GR reactivation in the lung; β LP : dissemination from the lung to LNs; K A : sensitivity of the immune system to antigen; β PP : dissemination in the lung. These processes are targets for therapeutics.

Parameters Lung GR role LN GR role LHS
β PP (day −1 )