Agent-based growth model of Mycobacterium tuberculosis: proposal of a tool to aid decision-making in in vitro experiments

Although treatable and preventable with medication, tuberculosis (TB) continues to infect and kill people globally, especially in the poorest countries and in the most vulnerable areas of rich countries. Given this situation, the study of the growth curve of Mycobacterium tuberculosis, which causes tuberculosis, can be a strong ally against TB, helping in the development of new drugs and of the theory. This paper describes a study of the Mycobacterium tuberculosis growth curve and aims to test hypotheses comparing data obtained by agent-based simulation with the population mean observed in an in vitro experiment. The tested hypotheses compare the means for the first hour of each day. The tests were performed using Student's t distribution. We believe that our approach, which uses an agent-based model, can reproduce Mycobacterium tuberculosis growth curves.


Introduction
Tuberculosis (TB) is a major public health problem, affecting predominantly low- and middle-income countries and, in high-income countries, developing among immigrants and the poorest and most vulnerable groups (Lönnroth et al.). According to Burgos and Pym (2002), Mycobacterium tuberculosis, which causes tuberculosis, is one of the most successful bacterial pathogens in human history.
A report published in 2015 by the World Health Organization (WHO; 2015) estimates that in 2014 there were 9.6 million new cases of tuberculosis (TB) and 1.5 million deaths, and that, together with HIV, tuberculosis is one of the leading causes of death from infectious diseases.
Therefore, the study of the Mycobacterium tuberculosis growth curve becomes extremely important, because it makes it possible to test hypotheses about behavior under environmental stress (Voskuil et al.;2004), to verify the reaction of the bacillus to drugs and to help develop new ones (Andries et al.). For these objectives, growth curves are important not only to determine the growth dynamics but also to sample bacteria in different growth phases and determine how their behavior varies during growth. However, the tuberculosis bacillus has a very slow rate of population growth, forms clumps as it grows and requires an enriched medium. Because of this behavior, in vitro experiments are very costly and need as many tools as possible for the rational design of studies involving the growth curve and the sampling of bacteria. Multi-agent systems, a field of artificial intelligence, make it possible to simulate the behavioral rules of a system computationally. For Garcia and Sichman (2005), "Agents are computer characters that act according to the program set, directly or indirectly, by a user. They can act alone or in communities, forming multi-agent systems."
Many measurable phenomena present in nature have probability distributions similar to some probabilistic models. Often, these models are used to represent the probability density function of random variables. Probabilistic models are useful in many real situations, for making predictions about the variables under study and for supporting decisions. It is believed that the main variables that model the Mycobacterium tuberculosis growth curve also resemble a probabilistic model.
The main goal of this work is to model the tuberculosis growth curve using agent-based simulations in which the values of some variables are drawn from probability distributions, thus making the developed system more similar to real systems. Finally, to test the similarity of the obtained growth curves, we performed hypothesis tests. The purpose of these tests is to provide a methodology for verifying whether sample data provide evidence supporting a formulated statistical hypothesis. We used the t-test to test the null hypothesis that the mean of the populations used to generate samples by agent-based simulation is equal to the population mean obtained in the in vitro experiment.
The paper is structured in 6 sections. Sections 2 and 3 present, respectively, the theoretical basis for the work and some studies on bacterial growth. Section 4 presents the proposal of the work: a model of the growth curve of Mycobacterium tuberculosis using agents. Section 5 presents the graphical and numerical comparison with in vitro experimental data. Finally, Section 6 presents the main contributions and further work.

Background Literature

Bacterial Growth Curve
When a bacterium is inoculated in a medium that contains all the nutrients necessary for its survival, it tends to duplicate. Initially, it adjusts to the new environment (lag phase) until it can begin to divide regularly (exponential or log phase). When growth becomes limited, the cells stop dividing (stationary phase), until finally they die because of the saturation of the environment (Todar;2013). These phases are shown in Fig. 1.

Figure 1: Growth curve and its phases (Todar;2013)

• Latency phase: Immediately after inoculation, the population temporarily remains unchanged. Although there is no apparent cell division, the bacteria can grow in mass or volume, synthesizing enzymes, proteins and RNA;
• Exponential phase: After adapting, the bacteria grow rapidly and consume the maximum amount of nutrients from the environment; in this stage they also release wastes and chemical signaling molecules into the environment;
• Stationary phase: After the population has grown, growth begins to decrease due to the exhaustion of available nutrients, waste accumulation in the environment and the lack of space. In this stage, the bacteria enter a dormancy state in which they save energy to increase their survival time;
• Decrease phase (death): The last phase of the curve is death, in which the bacteria begin to die for lack of nutrients.

Growth Curve of Mycobacterium tuberculosis
Von Groll et al. (2010) standardized a method for determining the growth curve based on a system that measures bacterial growth by monitoring the liquid medium with an oxygen sensor that emits fluorescence. Growth curves were generated by the MGIT (Mycobacteria Growth Indicator Tube). The data are for strains that originate in different geographical regions and are resistant to different drugs, according to Tab. 1. For each of the strains, one experiment was made with the same solubility. The monitoring was conducted for 25 days and, at every reading of the growth medium by the equipment, the growth was expressed in growth units (GU).
However, this method does not reveal how long the bacteria remained in the stationary phase or the rate of population decline in the death phase. This is because the measurement comes from the metabolic activity of the bacteria, and therefore growth can only be monitored while the population is growing.
In this paper, our hypothesis is that, at each time point, the simulated model can represent the model of Von Groll et al. (2010), i.e., the null hypothesis is that the population mean of the simulated model is equal to the value observed in the in vitro tests. For these analyses we used just one in vitro experiment, because this procedure has a high cost and takes a long time to complete. For this reason, it was not possible to obtain more observations.

Agent-based systems
Conventional simulation is one of the most viable tools available to design, plan, control and evaluate new alternatives or changes to the real world. To use computational simulation, we need software that represents the functions of the real world (Rebonatto;2000).
According to Azevedo and Menezes (2006), computational simulations are very appropriate for describing real systems, because a simulation tries to reproduce a real situation artificially, so that hypotheses can be verified without risks.
One type of simulation is agent-based simulation. In this type of simulation it is possible to recreate the population of a real system, where each individual of this population is represented by an agent and each agent has a set of specific rules that define its behavior and its interactions with other agents and with the environment in which it is inserted (Colella et al.). In agent-based systems, a real phenomenon is decomposed into a set of elements and their interactions. Each element is modeled as an agent, and the general model is the result of all interactions between the agents. Strack (1984) apud Adamatti (2011) reports that a simulation can be divided into three steps: i. Modelling Step: build the model of the phenomenon; ii. Experiment Step: apply variations to the built model, changing parameters that influence the resolution process; iii. Validation Step: compare the simulated data with real data in order to analyze the results.
To simulate a real problem computationally, a rigorous study is necessary in order to abstract all the variables and relationships that define the model. This study is usually done through observation and analysis of the real phenomenon.
Capturing all of the components of the simulation model is not an easy task and the higher the number of variables and more detailed the model to be simulated, the greater the computational work.
After the model is defined and the simulations are run, the results are compared with those observed in natural phenomena in order to evaluate their equivalence, i.e., their similarity to reality.
Agent-based simulation was chosen to model the Mycobacterium tuberculosis growth curve because of the need to integrate different behaviors among the agents, the dynamism that this kind of modeling enables, such as the interaction between agents and between agents and the environment, and the flexibility to modify and extend the model.

Related Works
With computational advances it became possible to analyze social and economic systems through simulation studies, including the agent-based modeling. Terano (2011) says that the simulation method based on agents is very important, since it can produce results without assumptions unlike conventional approaches.
Since most of the time it is not feasible to experimentally test all possible hypotheses, modeling and simulation can reduce time-consuming laboratory work and assist in property research. For this purpose, a wide variety of agent-based frameworks is currently available. Each one provides a different set of features in order to allow efficient simulation of certain types of systems. Gopalakrishnan et al. (2013) used an agent-based model to investigate the role of potential bacterial virulence in surgical site infections (SSI). For these authors, the dynamic representation of knowledge through computational modeling and simulation can augment traditional research studies, generating and instantiating new hypotheses, integrating information and filling gaps in the current knowledge base. In their work, the authors create an environment that reproduces the dynamics of healing in surgeries (Muscle Wound agent-based model, MWABM). The developed environment simulates the interaction between the cellular and molecular mechanisms of wound healing and contaminating bacteria. To study SSI, simulated bacteria were added to the base MWABM. The rules that govern the many types of cellular agents were created according to the literature. The execution of MWABM involved repeated iterations ("stages") in which the computational agents interact with other agents and with their environment. MWABM was used to identify threshold zones that marked the phase transition between healing, non-healing and abscess formation, with a specific emphasis on the difference between healing, the presence of avirulent bacteria and the presence of bacteria with virulence potential.
Until now, the focus of agent-based approaches to the study of bacterial population growth has been to develop models that can accurately replicate known results and to understand how these are affected by different behaviors within a bacterial population (Gorochowski et al.;2012). One of the first studies in this area was developed by Kreft et al. (1998), who created an individual-based simulator, BacSim, to represent bacterial growth dynamics. The authors created agents with generic behavioral rules for bacterial cultures, with the intention that the model serve as a basis for the study of growth in different bacterial colonies. The agents in the model absorb nutrients and thereby gain cell mass, and they also produce waste in the environment. When they reach a certain threshold, agents reproduce (mimicking the process of cell division). Each agent spends a fixed amount of energy to maintain its metabolic activities, and if the nutrient intake is less than necessary for this maintenance, the agent loses cell mass. Below a certain threshold, the agent dies, and a percentage of its cell mass is reintegrated into the environment as part of the substrate.
Another work that used agent-based simulation to recreate bacterial growth describes a model of Mycobacterium tuberculosis growth curves based on multi-agent systems. For its author, the study of this bacterium is very important, as it allows the study of its characteristics and the development of new drugs. Experimental tests with this bacillus are slow, taking at least three weeks to show any result, and often fail because of contamination or dehydration of the medium.
In that model, the agents that represent bacteria have different characteristics, which are extremely important for the agents to play their roles in the environment and to interact with it, similarly to how Mycobacterium tuberculosis bacteria would interact in their natural living environment. The author reports that the results were very satisfactory and that the curves obtained were very similar to the real curves. Finally, the author also points out that the model is useful because it enables hypotheses to be tested in a few hours, as opposed to tests carried out in vitro, which would take days.

Growth Model Curve Of Mycobacterium Tuberculosis
When the growth curves are obtained by MGIT, the information that these curves present is the product of several factors of population dynamics. Therefore, it is not possible to extract isolated information, such as how much the bacteria consume, how much they stop consuming after the environment reaches saturation, or the proportion of signaling molecules required to trigger the saturation.
Considering these circumstances, it is necessary to infer how many variables affect the population growth, using only the observation of the results.
To simulate the population dynamics we used the NetLogo (Wilensky; 1999) programming environment. The implemented agent-based model simulates an environment in which agents represent Mycobacterium tuberculosis.
The model agents have specific rules of behavior, which are modeled as variables of the agents. These rules are essential for them to play their role in the environment and determine how they interact.
Simulation time is divided into ticks. At every tick, the agents perform one or more actions. These actions are modeled by functions set out in the model, as follows: feed, continue signals and reproduce.
As many measurable phenomena present in our everyday lives tend to be distributed according to some theoretical probabilistic model, it is possible that the main variables that shape the growth curve of Mycobacterium tuberculosis are also distributed according to the normal model. Many biological random variables fit a normal distribution (Callegari-Jacques; 2003), i.e., central values are more frequent and extreme values are rarer (very low values are as infrequent as very high ones). It was therefore assumed that the model variables are also normally distributed.
Agents are inserted in an environment that is shared by all. In this environment, they find the nutrients needed to survive and deposit the waste from their metabolism.
Each space in the environment is called patch. In each patch, there is a number of nutrients and waste. Nutrients are used by agents throughout the simulation to keep their vital functions active and accumulate energy. The waste is deposited by the agents after the metabolization of nutrients.
As in in vitro experiments performed in the laboratory, where a number of bacteria are inoculated into a container, the model is initialized with a number of agents in the environment.
Agents receive different values for each variable. The possible values have different probabilities of occurrence, a characteristic of the normal distribution. When an agent is generated, its variables receive values simulated from the distribution used. Fig. 2 shows the agent's life cycle, clarifying the actions and decisions it must take at every tick.
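The initialization described above can be illustrated with a minimal Python sketch (the paper's model itself is written in NetLogo). The class `Bacterium`, its field names, the helper `draw_positive_normal` and every mean and standard deviation below are hypothetical placeholders chosen only for illustration; they are not the parameters of the proposed model. Redrawing non-positive values is also an assumption of this sketch.

```python
import random
from dataclasses import dataclass


@dataclass
class Bacterium:
    # Hypothetical per-agent variables; the names are illustrative only.
    adaptation_ticks: float      # ticks before the agent can reproduce
    reproduction_energy: float   # energy reserve required to divide
    intake_per_tick: float       # nutrients absorbed per tick


def draw_positive_normal(rng, mean, sd):
    """Draw from a normal distribution, redrawing until the value is positive."""
    value = rng.gauss(mean, sd)
    while value <= 0:
        value = rng.gauss(mean, sd)
    return value


def new_bacterium(rng):
    """Create one agent whose variables are drawn from normal distributions.

    All means and standard deviations are assumed values for this sketch.
    """
    return Bacterium(
        adaptation_ticks=draw_positive_normal(rng, 2340.0, 200.0),
        reproduction_energy=draw_positive_normal(rng, 50.0, 5.0),
        intake_per_tick=draw_positive_normal(rng, 2.0, 0.3),
    )


rng = random.Random(1)  # fixed seed so that a run can be repeated
population = [new_bacterium(rng) for _ in range(1000)]
```

Because each agent draws its own values, central values are common in the population and extreme values are rare, which is the individuality the model aims to capture.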
After the start of the simulation, the agents begin to move through the environment in search of nutrients. In the beginning, however, their only aim is survival: they are adapting to the environment and are therefore unable to reproduce. The same behavior is observed in in vitro experiments, in which the bacteria need a certain time to adjust their metabolic functions to the new habitat.
After this adaptation time, the agent starts to perform all its functions normally, including reproduction. It consumes nutrients and transforms them into energy, which is deposited in its reserve to indicate how healthy it is and to keep its vital functions active.
Each agent of the model has a different time to reproduce. This time indicates the amount of energy that the agent must have available in its energy reserve to perform the reproduction function. In this way, energy is a limiting factor for reproduction.
A certain amount of energy is needed to keep the agent alive at each cycle. If the agent consumes less than is necessary to maintain its metabolism, it starts spending energy from its reserve in order to survive, and when this reserve runs out the agent dies.
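The energy rule in the previous paragraph can be sketched as a single bookkeeping step per tick. This is a Python illustration under assumed names (`metabolize`, `reserve`, `intake`, `maintenance_cost`), not the NetLogo code of the model; treating a reserve of exactly zero as death is also an assumption.

```python
def metabolize(reserve, intake, maintenance_cost):
    """Update an agent's energy reserve for one tick.

    The energy obtained from nutrients is added to the reserve and the
    maintenance cost is subtracted. If intake is lower than the cost,
    the reserve shrinks; an exhausted reserve means the agent dies.
    Returns (new_reserve, alive).
    """
    new_reserve = reserve + intake - maintenance_cost
    if new_reserve <= 0:
        return 0.0, False
    return new_reserve, True


# An agent that eats less than it needs slowly spends its reserve.
reserve, alive, ticks = 5.0, True, 0
while alive:
    reserve, alive = metabolize(reserve, intake=1.0, maintenance_cost=2.0)
    ticks += 1
```

With these illustrative numbers the reserve falls by one unit per tick, so the agent survives four ticks and dies on the fifth.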
When waste accumulates in the patches, the agents have difficulty absorbing nutrients, so it becomes harder for agents to survive in a very saturated environment.
Another aspect of the curve modeling is the bacteria sensor. This variable determines how many agents must be in the environment for the saturation situation to be reached. When this threshold is reached, the agents release a signal molecule, in a process called quorum sensing, that warns the others that the environment is full. According to Whitehead et al. (2001), this process occurs when there is awareness of high cell density, allowing the entire population to initiate an action once the critical concentration has been reached.
The main action in the proposed growth model is the decision to reduce consumption. Once the agent perceives the saturation situation and releases a signal molecule, it enters a reduced-consumption state, in which it consumes fewer nutrients, generates less waste and reproduces less.
Reduced consumption is a boolean variable that can only receive two values, true or false. Once reduced consumption is true, the bacterium reduces the quantity of nutrients that it absorbs from the environment per tick and also proportionally decreases the amount of energy required to keep it alive.
At model initialization, all agents have reduced consumption set to false, and it only becomes true when the agent detects a significant number of agents or signals in the environment. This information is stored in the variables sensor signals and sensor bacteria.
Reducing the consumption of nutrients and the energy needed to maintain the agent's vital functions aims to make it grow less, so that it can survive longer in the environment.
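A minimal Python sketch of the reduced-consumption decision follows. The class `AgentState`, the function `update_consumption` and the `reduction_factor` of 0.5 are assumptions made for illustration; the text only states that intake and maintenance energy decrease proportionally once the agent detects enough agents or signals.

```python
from dataclasses import dataclass


@dataclass
class AgentState:
    intake_per_tick: float
    maintenance_cost: float
    reduced_consumption: bool = False   # false for every agent at initialization


def update_consumption(agent, agents_seen, signals_seen,
                       sensor_bacteria, sensor_signals,
                       reduction_factor=0.5):
    """Switch the agent to reduced consumption when a threshold is reached.

    Returns True only on the tick in which the agent changes state, which
    is when it would also release its own signal molecule.
    reduction_factor is a hypothetical value, not taken from the model.
    """
    if agent.reduced_consumption:
        return False
    if agents_seen >= sensor_bacteria or signals_seen >= sensor_signals:
        agent.reduced_consumption = True
        # Intake and maintenance energy shrink by the same proportion.
        agent.intake_per_tick *= reduction_factor
        agent.maintenance_cost *= reduction_factor
        return True
    return False


agent = AgentState(intake_per_tick=4.0, maintenance_cost=2.0)
released_early = update_consumption(agent, 3, 0, sensor_bacteria=10, sensor_signals=5)
released_later = update_consumption(agent, 12, 0, sensor_bacteria=10, sensor_signals=5)
```

Returning a flag for the state change keeps the signalling step separate from the consumption rule, so the caller can decide how the signal spreads in the environment.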

Model Implementation
After defining the population dynamics, the next step was to implement the model in NetLogo (Wilensky;1999). Through user intervention, it is possible to configure some environmental parameters, such as the initial number of agents and the amount of available nutrients. Fig. 3 shows the model interface created and its variables. The results obtained by Von Groll et al. (2010) represent the growth curves of Mycobacterium tuberculosis in only two stages, the adaptation phase and the exponential phase. In order to facilitate comparisons between real and simulated growth, a switch was created that enables the user to simulate curves with the same stages of growth, i.e., when it is detected that the curve is no longer growing, the simulation is terminated. See more details about this model in Moraes et al. (2017).
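One way to detect that the curve is no longer growing is sketched below in Python. The function `run_until_plateau`, the `patience` window and the toy `step` function are assumptions for illustration only; the criterion actually used in the NetLogo model is not described here.

```python
def run_until_plateau(step, initial_population, patience=260, max_ticks=10000):
    """Advance a model until the population stops growing.

    step is any function mapping the current population to the next one.
    The run ends when no new maximum has been seen for `patience`
    consecutive ticks (an assumed stopping rule) or after max_ticks.
    Returns the population recorded at every tick.
    """
    history = [initial_population]
    best = initial_population
    ticks_without_growth = 0
    while ticks_without_growth < patience and len(history) <= max_ticks:
        population = step(history[-1])
        history.append(population)
        if population > best:
            best = population
            ticks_without_growth = 0
        else:
            ticks_without_growth += 1
    return history


# Toy dynamics: the population doubles until it saturates at 64 agents.
curve = run_until_plateau(lambda p: min(p * 2, 64), 1, patience=3)
```

With the toy dynamics the run stops three ticks after the population first reaches 64, so the recorded curve contains only the adaptation and growth stages plus a short plateau.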

Values interpretation
The curves from the MGIT express the result in growth units (GU). Therefore, an arbitrary measure of equivalence is needed to relate GUs to the number of simulation agents. In the model developed, one agent represents two GUs.
The growth curves of Fig. 4 were obtained by Von Groll et al. (2010), using MGIT, with growth in GUs and time in days. In the figure, however, growth is expressed in number of agents. The equivalence between GUs and agents was 20,000 GUs to 10,000 agents.
Using the same idea, an equivalence was also defined for the growth time. As the experimental curve expresses the results by day and the proposed simulation model divides time into ticks, an equivalence of 260 ticks for each simulated day was used.
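The two equivalences can be written as small conversion helpers. The Python sketch below uses the values stated in the text (two GUs per agent, 260 ticks per day); the function and constant names are hypothetical.

```python
TICKS_PER_DAY = 260   # one simulated day corresponds to 260 ticks
GU_PER_AGENT = 2      # one agent represents two growth units


def agents_to_gu(agents):
    """Convert a number of agents to MGIT growth units."""
    return agents * GU_PER_AGENT


def gu_to_agents(gu):
    """Convert MGIT growth units to the equivalent number of agents."""
    return gu / GU_PER_AGENT


def ticks_to_days(ticks):
    """Convert simulation ticks to days."""
    return ticks / TICKS_PER_DAY


def days_to_ticks(days):
    """Convert days to simulation ticks."""
    return days * TICKS_PER_DAY


# 10,000 agents correspond to 20,000 GUs; day 10 starts at tick 2600.
gu_at_peak = agents_to_gu(10_000)
tick_of_day_10 = days_to_ticks(10)
```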

Results
This section presents the results obtained after the implementation of the proposed model. First, it presents the curves generated by the model together with the real (in vitro) data. Second, it presents the validation, showing some data and hypothesis tests. Fig. 5 shows the results generated for all strains. The generated values were obtained by averaging 50 simulations for each of the strains (continuous line). The gray shaded area represents the standard deviation.
The standard deviation indicates how much the generated curves can vary: the higher the standard deviation, the lower the robustness (stability) of the model. Each graph of Fig. 5 is composed of two growth curves: the dotted line represents the real data and the continuous line represents the simulated data.
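The averaged curve and its shaded band can be computed per time point from the individual runs. The Python sketch below assumes that every run is a list of agent counts of equal length; `summarize_runs` and the three toy runs are illustrative and do not reproduce the 50 simulations of the paper.

```python
from statistics import mean, stdev


def summarize_runs(runs):
    """Return the per-time-point mean and sample standard deviation.

    runs is a list of simulated curves, each a list with one population
    value per time point; all curves must have the same length.
    """
    means, sds = [], []
    for values in zip(*runs):   # values observed at one time point
        means.append(mean(values))
        sds.append(stdev(values))
    return means, sds


# Three toy runs stand in for the simulations of one strain.
runs = [[0, 2, 4], [0, 4, 8], [0, 6, 12]]
means, sds = summarize_runs(runs)
lower_band = [m - s for m, s in zip(means, sds)]
upper_band = [m + s for m, s in zip(means, sds)]
```

The mean gives the continuous line, and the interval between `lower_band` and `upper_band` corresponds to a shaded area of one standard deviation around it.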

Model Validation
The proposed model was tested in order to be validated. The graphical and numerical results presented in the previous section were compared with the real data obtained by Von Groll et al. (2010). To make the comparison, it was necessary to convert the simulated data: the ticks were converted to days and the number of agents to growth units. Recall that each day is equal to 260 ticks and each agent represents two growth units.
Various parameter settings were tested. These adjustments allowed the simulated growth curves to reproduce the behavior of the real growth curves with relative similarity.
By analyzing the obtained data, it is clear that some simulations better represent the beginning of growth, others the middle of growth, and most of them can faithfully represent the end of growth. Therefore, we cannot conclude which representation is best by viewing only the graphs.
To make a more precise comparison between the real growth curves and the simulated curves, we performed hypothesis tests using Student's test (t-test).
The tests were executed in the R software, through the t.test function, at a significance level of 1%. The level of significance is the maximum allowed probability for the test statistic to fall in the critical region when the null hypothesis is actually true. The function returns the value of the t-statistic, the number of degrees of freedom, the corresponding p-value, and the confidence interval. The p-value is the probability of obtaining a test statistic value that is at least as extreme as the one computed from the sample data, assuming that the null hypothesis is true. The t-tests were performed for the first hour of each day, starting on the tenth day. From the first to the ninth day, it was not possible to perform them because no experiment showed population growth, which characterizes a similar behavior in these first 9 days, corresponding to the adaptation phase observed in in vitro experiments.
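For reference, the statistic behind a one-sample t-test is t = (x̄ - µ0) / (s / √n), with n - 1 degrees of freedom, where x̄ and s are the sample mean and sample standard deviation of the simulated values and µ0 is the value observed in vitro. The Python sketch below computes it with the standard library only; it is an illustration, not the R call used in the paper, and the function names and sample values are hypothetical. The critical value needed for the confidence interval is left as an input taken from a t table.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """Return the t statistic and degrees of freedom for H0: mean == mu0."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)
    t_statistic = (mean(sample) - mu0) / standard_error
    return t_statistic, n - 1


def confidence_interval(sample, t_critical):
    """Return (lower, upper) for the mean, given a critical t value.

    t_critical must be looked up for the chosen confidence level and
    degrees of freedom; it is not computed here.
    """
    half_width = t_critical * stdev(sample) / sqrt(len(sample))
    centre = mean(sample)
    return centre - half_width, centre + half_width


# Hypothetical simulated values (in GUs) for one day, tested against mu0 = 3.
t_statistic, degrees_of_freedom = one_sample_t([1, 2, 3, 4, 5], mu0=3)
```

The null hypothesis is rejected at the 1% level when the absolute value of the statistic exceeds the two-sided critical value for the given degrees of freedom, which is equivalent to a p-value below 0.01.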
Tables 2, 3, 4 and 5 show the null hypotheses (µ in GUs) evaluated for each of these days and the respective values for the sample mean (without GUs), sample standard deviation (without GUs), test statistic, p-value and 99% confidence interval (in GUs).
For the GC 02-2761 and GC 03-0850 strains, from the 10th to the 13th day, the p-values were lower than the adopted level of significance (0.01), which is evidence against the null hypothesis, since these results suggest that the observed result would be very unlikely if the null hypothesis were true. On the tenth day, for the GC 02-2761 strain, the sample mean was lower than the population mean stated in the null hypothesis; the same did not happen on the 11th, 12th and 13th days. From day 14 on, however, the GC 02-2761 and GC 03-0850 strains have p-values higher than the adopted level of significance, so for those days there is no evidence against the null hypothesis.
The GC 03-2922 strain began its growth one day after the others. Therefore, the analyses were performed from the 11th day. The p-values were higher than the level of significance from the 15th to the 20th day, and therefore there is no evidence against the null hypothesis for those days. On the 16th day, the p-value was very close to the level of significance (0.01).
For the H37Rv strain, from the 10th to the 13th day and from the 15th to the 17th day, evidence was found against the null hypothesis. For the 14th day and from the 18th to the 21st day, there is no evidence against the null hypothesis, since the p-value is greater than the level of significance. However, it is important to highlight that the p-value found for the 14th day was very close to the adopted level of significance (0.01).

Conclusion and Further Works
Tuberculosis is one of the oldest diseases, with wide geographical distribution, constituting a serious public health problem worldwide. The study of the growth curve of Mycobacterium tuberculosis, which causes tuberculosis, enables the understanding of various behaviors of the bacillus, such as the response to different chemical agents and environmental conditions. However, the behavior of the bacillus under different conditions is still not predicted by in silico models. Modeling the growth curve provides a tool for better designing studies that involve the growth curve under normal growth conditions and for maximizing the analyses.
This work shows the development of a growth curve model of Mycobacterium tuberculosis that is able to reproduce the real curves given as input. The proposal was inspired by earlier work that created a multi-agent model capable of representing the real curves of Mycobacterium tuberculosis with a certain degree of similarity, and by Von Groll et al. (2010), who standardized a method to measure the population growth of the bacillus when placed in a medium. The results obtained by Von Groll et al. (2010) made it possible to verify the model's fidelity in reproducing the real curves. The proposed growth curve was modeled using agent-based simulation. Wooldridge (2009) says that multi-agent systems are a powerful and flexible tool for modeling environmental and social systems, because this type of system makes it possible to analyze the behavior of each individual rather than their average behavior.
To make the proposed model more similar to real growth, we proposed that the possible values of the variables be stochastically distributed, thus treating the individuality of each agent. Our hypothesis was that estimating the best possible probability density function of the variables would yield, as the final product, a curve identical to the real data or with minimum error.
In order to compare data from agent-based simulation with data obtained from an in vitro experiment, hypothesis tests were performed, because we believed that the developed model could reproduce Mycobacterium tuberculosis growth curves similar to those obtained in an in vitro experiment. For this verification, Student's t-test was used, where the hypothesis to be tested was that the average for each day is not different from the average obtained in the in vitro experiment.
The agent-based model presented satisfactory results for most days, where results that do not contradict the null hypothesis are considered satisfactory. In our analysis, satisfactory results were found on almost 80% of the days for strains GC 02-2761 and GC 03-0850 and on 75% of the days for strains GC 03-2922 and H37Rv. Also, most of the unsatisfactory results occurred in the first 4 days analyzed (10th to 13th or 11th to 14th).
Considering the results of the statistical tests carried out, it is believed that sufficient numerical arguments were obtained for the use of the developed agent-based model to simulate growth curves of Mycobacterium tuberculosis. Therefore, it can be very useful for checking hypotheses and aiding real experiments.
In this work, we used data from just one real experiment. However, it is important, when possible, to carry out more studies with real samples and to define a larger number of observations for the comparison, in order to strengthen the results.
In this work, we simulated the growth of Mycobacterium tuberculosis considering normal growth conditions. Future studies are intended to introduce into the model situations that occur in in vitro experiments, such as the use of antibiotics and the resistance of the bacillus at different concentrations.