Improving epidemic testing and containment strategies using machine learning

Containment of epidemic outbreaks entails great societal and economic costs. Cost-effective containment strategies rely on efficiently identifying infected individuals, making the best possible use of the available testing resources. Therefore, quickly identifying the optimal testing strategy is of critical importance. Here, we demonstrate that machine learning can be used to identify which individuals are most beneficial to test, automatically and dynamically adapting the testing strategy to the characteristics of the disease outbreak. Specifically, we simulate an outbreak using the archetypal susceptible-infectious-recovered (SIR) model and we use data about the first confirmed cases to train a neural network that learns to make predictions about the rest of the population. Using these predictions, we manage to contain the outbreak more effectively and more quickly than with standard approaches. Furthermore, we demonstrate how this method can also be used when there is a possibility of reinfection (SIRS model) to efficiently eradicate an endemic disease.

Compartmental epidemiological models provide a simple and powerful mathematical framework to capture the main features of a disease outbreak in a population [1,2]. They consider how a disease spreads in a finite population of individuals over a time interval. The individuals are compartmentalized into categories based on their epidemiological condition. The first such model, known as the susceptible-infectious-recovered (SIR) model, was proposed in 1927 by Kermack and McKendrick [3] and is still widely employed today [4]. In the SIR model, there are three categories: susceptible individuals that have never been infected; infectious individuals that are currently infected; and recovered individuals that have previously been infected and are now immunized against the disease. Initially, all individuals are susceptible except for a limited group of infectious individuals, who seed the disease.
In the event of a disease outbreak, it is often desirable to attempt to contain or eradicate it. Different factors influence how effective a containment strategy is, including the characteristics of the disease and of the population [5,6]. However, these characteristics are often difficult to measure or model precisely, especially for novel diseases during their first outbreaks [6][7][8][9][10][11][12][13]. The World Health Organization provides some general guidelines for strategies to prevent disease spread [14], which include travel restrictions, social distancing, and enforced quarantine. In particular, the isolation of potentially infected individuals is often the most effective measure to limit the spread of the infection. The safest approach would be to isolate and quarantine all individuals regardless of their epidemiological condition. However, this cannot be implemented and maintained on a large scale for a prolonged period because of its deleterious societal and economic effects [15].
In order to implement efficient, cost-effective strategies to contain an outbreak, it is therefore critical to promptly identify infectious individuals. The most straightforward approach would be to test all the individuals and immediately identify and isolate/treat the infectious ones [16]. In a real-life large-scale epidemic, however, extensive testing is not usually feasible because of economic and logistic constraints [17][18][19]. Therefore, the containment of the disease also requires interventions on individuals who have not been tested yet, which again entails societal and economic costs [20].
Here, we demonstrate that machine learning can be used to identify an optimized test strategy, i.e., which individuals are most beneficial to test. Specifically, we introduce a neural-network-powered strategy [21,22] for testing and isolating individuals, even though the parameters of the model are not known and infectious individuals can be asymptomatic. The neural network informs the decision on which individuals should be tested and isolated. Modelling a disease outbreak using the SIR model [3,4], we demonstrate that, for an equal number of quarantined individuals, the neural-network-informed strategy manages to contain the disease outbreak more effectively than alternative standard contact-tracing strategies, while autonomously and dynamically adapting to the specifics of the outbreak using only the information about the first confirmed cases. Furthermore, since for many diseases immunity is not lasting, we also demonstrate how the neural-network-informed approach can be used to efficiently prevent a new disease from becoming endemic when there is a possibility of reinfection (SIRS model). We envision that similar methods can be employed in public health to control epidemic outbreaks and to eradicate endemic diseases.
FIG. 1. SIR model and containment strategies. a We consider a population of $10^5$ individuals moving on a square lattice (320 × 320 cells). Each individual can be either susceptible (S, grey), infectious (I, orange), or recovered (R, black). At each time step, the susceptible individuals become infectious with probability β when they occupy the same cell as an infectious individual, and the infectious individuals recover with probability γ, becoming immunized against the disease. b Temperature of individuals that are healthy (S and R, 36.8 ± 1.0) and infectious (I, 37.4 ± 1.2); note the range of asymptomatic individuals, i.e., infectious individuals with temperature in a healthy range. c Disease spread at times t = 20, 50, 100 in the absence of any containment measures and d corresponding fraction of the population in each category; unchecked, the disease spreads to almost all the population. e Disease spread using standard contact tracing to isolate potentially infectious individuals starting at t = 20 (dashed vertical line) and f corresponding fraction of the population in each category; the disease spreads more slowly than in c, but is not contained. g Disease spread when a total lockdown is implemented at t = 20 (dashed vertical line) preventing any further spread of the disease and h corresponding fraction of the population in each category.
We consider a SIR model where the population consists of $N = 10^5$ individuals distributed uniformly on a square lattice with 320 × 320 cells, resulting in an average density of 0.98. The individuals move as random walkers on the lattice [24,25], each confined to a region with an average radius of r = 10 cells [26]. All their positions are updated simultaneously at each time step. Each individual always belongs to one of the SIR categories (Figure 1a). At the beginning of the simulation, 50 individuals (0.05% of the total population) are randomly selected and made infectious (I). The rest of the population is initialized as susceptible (S). The disease is transmitted with probability β when susceptible and infected individuals occupy the same cell, to mimic the short-range interactions necessary for disease spreading. An infected individual has a probability γ of recovering in each time step, after which it becomes immunized against the disease. Each individual is also characterized by a "temperature", which slightly increases as the disease develops; the temperature is normally distributed and corresponds to 36.8 ± 1.0 for healthy (i.e., susceptible and recovered) individuals, and to 37.4 ± 1.2 for infectious individuals (Figure 1b), so that there is a significant overlap between the two distributions and, thus, some individuals can be "asymptomatic". We let the model evolve for 150 time steps, which can be thought of as the days of an epidemic outbreak that lasts approximately six months, but can easily be rescaled to fit another time scale. Figure 1c provides an example of the free evolution of the outbreak in the absence of any containment measures. By t = 20, the disease has spread from the initial infectious individuals, creating a few hotspots. These hotspots steadily grow (t = 50) until most of the population has been infected (t = 100) and the outbreak starts to subside.
Figure 1d shows how the fraction of individuals in each category varies over time: as the disease spreads, the number of susceptible individuals steadily decreases and the number of recovered ones increases, while the number of infectious individuals initially grows and then slowly decreases until the outbreak ends because essentially the whole population is immunized.
The spread of the disease can be controlled by enacting containment measures. For example, Figures 1e and 1f show the evolution of the outbreak when potentially infectious individuals are isolated based on standard contact tracing [18,19,27,28] (see details in Methods "Contact tracing"). At each time step, a fixed number of tests ($N_{\mathrm{test}} = 100 \ll N$) are performed to assess whether individuals are infectious. The value of $N_{\mathrm{test}}$ is set low enough to simulate limited access to testing, so that only a small portion of the population can be tested (15% in 150 time steps). The individuals to be tested are selected randomly from the susceptible individuals with the highest temperature, i.e., those that show the clearest symptoms. Selecting the individuals to be tested in this way presents two advantages compared to a purely random testing strategy: it avoids a slow start (with an initial probability of success around 1/2000), and it is more representative of reality (where symptomatic cases first indicate an outbreak). For simplicity, we assume that the test never fails and that there is no delay between performing the test and receiving the result. However, we remark that the task of identifying the infectious individuals is made harder by the fact that some of their temperatures are in the healthy range (Figure 1b), making them asymptomatic. The individuals who test positive are quarantined: from that time step on, they neither move nor interact with the rest of the population. For the tested individuals, the isolation is temporary, since the system knows when they stop being infectious and can safely return to interacting with the rest of the population.
Due to the limited number of tests, quarantining only the individuals that test positive is not enough to contain the outbreak. It is therefore necessary to use contact tracing to also isolate individuals who have not been tested. (While testing starts from the first time step, contact tracing and isolation of individuals start only at t = 20.) For all detected infectious individuals, we trace back their previous contacts up to 50 time steps in the past. Within this group of individuals that interacted with confirmed cases, we test those with the highest temperature. We rank the other individuals according to their number of contacts with infectious individuals and, given the same number of contacts, according to their current temperature. We isolate individuals from this ranking until reaching a predetermined fraction of the population (here, 25%). (See details in Methods "Contact tracing".) It is interesting to compare the free evolution of the outbreak (Figures 1c-d) with the case with isolation based on contact tracing (Figures 1e-f). While at t = 20 both outbreaks are similar, the containment measures take hold almost immediately, significantly reducing the size of the outbreak and the fraction of individuals that are infectious at the same time. The epidemic outbreak remains confined to a few areas, reaching only a part of the population (Figure 1e), and the curve of infected individuals is flatter (Figure 1f). We remark that, despite its success in slowing down the spread of the disease, the strategy relying on isolation of potentially infectious individuals identified by contact tracing does not lead to a complete suppression of the outbreak either, as can be seen from the fact that nearly 20% of the population is still infectious at t = 150.
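The two figures quoted in this paragraph follow directly from the simulation parameters. As a quick check, writing T = 150 for the number of time steps and I(0) = 50 for the number of initially infectious individuals (both symbols are introduced here only for this note):

$$\frac{N_{\mathrm{test}}\, T}{N} = \frac{100 \times 150}{10^5} = 15\%, \qquad \frac{I(0)}{N} = \frac{50}{10^5} = \frac{1}{2000}.$$

The second ratio is the chance that a single individual drawn uniformly at random at the start of the simulation is infectious.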
Complete eradication of the disease is in principle possible by adopting an unrealistic total lockdown, where the whole population is quarantined simultaneously (Figures 1g-h). From t = 20, all individuals are isolated so that they cannot move or interact. Figure 1g shows how this leads to an almost immediate containment of the disease hotspots. More interestingly, Figure 1h

Neural-network-informed testing
It would be desirable to achieve disease eradication as in the total-lockdown strategy (see Figures 1g-h), but isolating only part of the population as in the contact-tracing strategy (see Figures 1e-f). To achieve this, we propose a strategy that employs a neural network to inform which individuals to test and isolate.
The schematic of the neural network we employ is shown in Figure 2a (see details in Methods "Neural network"). In general, a neural network receives some inputs, elaborates them through a series of hidden layers of artificial neurons, and returns an output [29]. In our case, the input consists of contact-tracing information for a given individual n for the last 10 time steps. Specifically, we provide the neural network with five time series. The first three indicate the number of tested infectious individuals within a distance r = 4, 8, and 16 cells from the considered one. $C_n^{\mathrm{tot}}(t)$ is the total number of contacts (defined as individuals occupying the same cell at the same time) and $C_n^{i}(t)$ is the number of contacts with confirmed infectious individuals. Then, the neural network elaborates this information through three dense layers of artificial neurons. Finally, the neural network outputs a value p, representing the risk of being infectious at the current time step, between 0 for a putatively healthy individual and 1 for a putatively infectious individual. Individuals with p > 0.995 are immediately isolated, while individuals with p ∈ [0.5, 0.995] are slated to be tested, starting from the individuals with the highest temperatures until the depletion of all available tests. In this way, we manage to freeze the infectious individuals that are easy to identify, while optimizing the deployment of the available tests: we use the tests principally to achieve a better understanding of the extent and distribution of the disease.
Neural networks are supervised machine learning methods and, therefore, require training [29]. In general, the training of a neural network is performed by providing the neural network with a series of inputs and corresponding known outputs [29]. In our case, for training, we can only use individuals that have already been tested within each run of the simulation (see details in Methods "Neural-network training"). Therefore, we start training at t = 20, when we have tested 2000 individuals. In subsequent time steps, the size and accuracy of the training data set increase with the number of performed tests, so we repeatedly retrain the neural network to improve its performance. This leads to a positive feedback loop, where a better-trained neural network selects individuals for testing more efficiently, which in turn provides better insights into the disease distribution, which finally improves the training data set available to further improve the performance of the neural network. Figure 2b depicts the snapshots of the system at t = 20, 50, 100. The color code is the same as that used in Figure 1, with the addition of frozen individuals (F) indicated in light blue. Until t = 20, the outbreak evolves freely, analogously to Figure 1c, while enough data are accumulated to train the neural network. From t = 20 onward, the neural-network predictions are used to inform which individuals to isolate and test. By t = 50, all outbreaks have been identified and surrounded by frozen individuals. Subsequently (t = 100), the outbreaks remain under control and are prevented from spreading, in stark contrast with the wide spread of the disease in free evolution (t = 100 in Figure 1c).
The orange solid line in Figure 2c shows the fraction of the population that is infectious as a function of time. Shortly after we switch on the neural network (t = 20), the infectious fraction reaches its maximum (5.1% at t = 26) and subsequently rapidly decreases to zero. Correspondingly, the numbers of recovered (black solid line) and susceptible (gray solid line) individuals reach a plateau. In particular, the fraction of individuals that are infected and eventually recover is 8 ± 4%.
The number of frozen individuals is initially zero and quickly increases in the first stages of neural-network-informed testing, eventually reaching the set value of 25% of the total population. We can compare the curve of the infectious individuals using the neural-network-informed testing and isolation (orange solid line) with the limiting cases of free evolution (orange dotted line, cf. Figure 1c) and of total lockdown (orange dashed line, cf. Figure 1g). By isolating only 25% of the population, the neural-network-informed strategy achieves a containment of the epidemic similar to that achieved by the full lockdown. Figure 2d represents the fraction of new infectious individuals per time step for the neural-network-informed strategy (orange line) and for the free evolution of the epidemic (black line). The free-evolution curve reaches a maximum at t = 59 corresponding to δI(59) = 1.4 ± 0.2%. The curve for the neural-network-informed strategy starts decreasing immediately after isolation starts at t = 20, corresponding to a peak value δI(20) = 0.55 ± 0.08%, and stably reaches zero around t = 50.
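The three radius-based inputs described above can be illustrated with a short sketch. It is not the authors' implementation: the function names are hypothetical, and the use of Euclidean distance on the lattice without periodic boundaries is an assumption, since the text only speaks of "a distance r".

```python
import math


def confirmed_within_radius(position, confirmed_positions, radius):
    """Count confirmed infectious individuals within `radius` cells of `position`.

    Assumption of this sketch: Euclidean distance, no periodic boundaries.
    """
    x0, y0 = position
    return sum(
        1
        for (x, y) in confirmed_positions
        if math.hypot(x - x0, y - y0) <= radius
    )


def radius_features(position, confirmed_positions, radii=(4, 8, 16)):
    """Return the three radius counts (r = 4, 8, 16) for one individual at one time step."""
    return [confirmed_within_radius(position, confirmed_positions, r) for r in radii]
```

Evaluating `radius_features` at each of the last 10 time steps would give three of the five time series passed to the network for that individual.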
Figures 2e-g provide comparisons with a standard contact-tracing strategy, where the same number of individuals are tested and isolated as described in detail in the previous section. Figure 2e shows snapshots of the system at t = 20, 50, 100: starting from the same number of hotspots (t = 20), contact tracing manages to identify all regions reached by the disease (t = 50), but the disease can still spread due to the limited number of individuals that can be isolated (t = 100). Figure 2f shows that, unlike in the case of the neural-network-informed strategy (Figure 2c), the increase of the fraction of infected individuals slows down for some time steps, but then starts to grow again, reaching a peak at t = 120 corresponding to about 20% of the total population. The total fraction of individuals that have been infected by the end of the simulation (i.e., all infectious and recovered individuals at t = 150) is strikingly lower for the neural-network-informed strategy (6% to 14%) than for the contact-tracing-based strategy (30% to 89%). The wide shaded area in Figure 2f is nearly 7 times larger than in Figure 2c, showing that contact tracing is less stable against different evolution patterns of an epidemic with the same underlying SIR parameters. The orange line in Figure 2g shows the fraction of new infectious individuals δI as a function of time, which is non-zero at the end of the simulation, unlike for the neural-network-informed strategy (orange line in Figure 2d). We can therefore conclude that contact tracing is less effective than the neural network for the same number of frozen individuals.
FIG. 2. Improved outbreak containment using neural-network-informed testing. a Structure of the neural network. The inputs (table on the left) are $R_{4,n}(t)$, $R_{8,n}(t)$, $R_{16,n}(t)$, $C_n^{i}(t)$, and $C_n^{i}(t)/C_n^{\mathrm{tot}}(t)$, where $R_{r,n}$ is the number of confirmed infectious individuals in a radius r from the individual n, $C_n^{i}$ is the number of contacts that individual n has had with confirmed infectious individuals, and $C_n^{\mathrm{tot}}$ the total number of contacts for individual n; for each parameter, the input includes the history during 10 time steps ([t − 9, t]). The neural network analyzes these inputs through three dense layers and outputs a value p from 0 (individual predicted to be healthy) to 1 (individual predicted to be infectious): individuals with p > 0.995 are directly quarantined, and individuals with p ∈ [0.5, 0.995] are tested starting from individuals with the highest temperatures until the depletion of the available tests. b-d Disease evolution when the testing and isolation strategy is determined based on the output from a neural network: b Snapshots of susceptibles (S, grey), infectious (I, orange), recovered (R, black), and frozen (F, blue) individuals at time steps t = 20, 50, 100. c Corresponding fraction of the population in each category compared with the two limiting cases of free evolution (dotted orange line, see also

Automatic and dynamic adaptation to the outbreak characteristics
An important characteristic of the neural-network-informed strategy is that it can automatically and dynamically adapt itself to the underlying characteristics of the outbreak. In our model, this means that the neural network does not need to have explicit knowledge of the underlying SIR model. More generally, the neural network can adapt to other kinds of outbreaks and also take into account the effects of the containment measures put in place. Figure 3c). We then apply $\mathrm{NN}_B$, i.e., another neural network trained on a different outbreak whose underlying SIR model has a slightly different transmission rate ($\mathrm{SIR}_B$, $\beta_B$ = 0.8 and $\gamma_B$ = 0.03). The resulting performance can be seen in the gray lines in Figures 3a-c. While overall $\mathrm{NN}_B$ manages to improve the containment of the outbreak with underlying $\mathrm{SIR}_A$ model compared to its free evolution, it performs much worse than $\mathrm{NN}_A$. At the end of the simulation in Figure 3a, the fraction of infectious individuals is still in the range 0.12% to 13.7% of the population for the gray confidence bands, while the overall fraction of individuals in isolation is in the range 30% to 72%, as shown in Figure 3b. This suggests that, thanks to its training using the information acquired by the testing during the first 20 time steps, the neural-network-informed strategy gets fine-tuned to the specific characteristics of the underlying outbreak.
We further validate the fine-tuning of the neural network by training $\mathrm{NN}_B$ on the testing data obtained from the outbreak with underlying model $\mathrm{SIR}_B$. The colored lines in Figures 3d-f show the results of applying $\mathrm{NN}_B$ to the $\mathrm{SIR}_B$ outbreak, which demonstrate a good containment of the outbreak. Instead, the gray lines show what happens when using $\mathrm{NN}_A$, which leads to a much worse outcome. In this scenario, the curve of infected individuals peaks around t = 84 at 25.7% of the population, against 8.1% for the training performed on $\mathrm{SIR}_B$. Figure 3f shows that δI oscillates between 540 and 995 new cases per time step in the interval t ∈ [20, 73] before decreasing.

Disease eradication with possibility of reinfection
We now consider the case when the immunity against the disease is not permanent [30][31][32]. Thus, we consider a SIRS model (Figure 4a), which is an extension of the SIR model where recovered individuals have a probability ρ at each time step of becoming susceptible again [30,31] (see details in Methods "SIRS model"). In the absence of any containment measures, the possibility of reinfection leads to an endemization of the disease. Figure 4b shows such free evolution of the disease: from the initial hotspots (t = 20), the disease spreads quickly to a large portion of the population (t = 50) until reaching a steady state. Figure 4c shows how the fraction of individuals in each category varies over time: during the initial spread of the disease, the number of susceptible individuals steadily decreases and the number of infectious ones increases; once the disease reaches its steady state, the fraction of infectious individuals stabilizes to a value that depends on the characteristics of the SIRS model, i.e., on the value of its parameters β, γ and ρ. Therefore, the disease becomes endemic [1]. Figures 4d-e show the development of the disease when a standard contact-tracing-based containment strategy is implemented, like that employed in Figures 1e-f. The solid lines represent the averages for susceptible (S, gray), infectious (I, orange), recovered (R, black) and frozen (F, blue) individuals throughout the simulation. The color bands, which denote the 90% confidence interval, are wider than those in Figure 2f; this implies that the performance of the contact-tracing strategy can vary significantly depending on the specific outbreak. It can be seen that this containment approach manages to reduce the number of infectious individuals in the steady state of the disease, but not to eradicate the disease itself.
Finally, Figures 4f-g show the performance of the neural-network-informed strategy. We employ the same approach and neural network architecture shown in Figure 2a and the same strategy that we employed to contain the outbreaks in the SIR model shown in Figures 2b-d. Briefly, we start testing individuals from the beginning of the simulation, accumulating data to train the neural network. From t = 20, we start training the neural network to predict infectious individuals and use this information to decide which individuals to isolate and test. The neural-network-informed strategy manages to eradicate the disease, as can be seen from the fact that the fraction of infectious individuals approaches zero by the end of the simulation (orange solid line in Figure 4g), while the number of susceptible individuals increases as recovered individuals gradually lose their immunity. Therefore, by employing the neural-network-informed strategy, it is possible to prevent the initial outbreak from leading to the endemization of the disease. Overall, these results show that the neural network gets automatically optimized for the parameters of the underlying outbreak. In all cases, the shaded areas represent the 90% confidence intervals obtained from 100 simulations.

DISCUSSION
The current outbreak of the novel coronavirus disease (COVID-19) [7][33][34][35][36] has dramatically brought to worldwide attention the crucial importance of epidemiological models for choosing the best strategies and policies to contain disease outbreaks [6,7,9,13,20]. Machine-learning approaches have already been proposed to help disease diagnosis [37] and epidemics handling [13]. In fact, in the last few years, various neural-network architectures have been employed to manage human diseases [38][39][40][41], such as malaria [42], and animal diseases, such as swine flu [43]. In this work, we have shown how a neural-network-informed strategy can improve the containment of an epidemic, even when only a small number of specific tests is available and some of the individuals are asymptomatic. This improvement can be seen in three key aspects. First, integrating the neural network into the outbreak handling improves the performance of contact tracing, while performing the same number of tests and isolating the same fraction of individuals. Second, the neural network autonomously tunes its weights to the ongoing outbreak, without needing to explicitly know its underlying model or its parameters, and therefore does not require a priori knowledge of the disease outbreak characteristics. Third, since the neural network is regularly retrained as new data become available, it can automatically and dynamically adapt itself to the evolution of the outbreak as well as to the changes in the behavior of the population, e.g., due to containment measures or different social habits. As a striking example, we have shown that, in the case of temporary immunization, the neural-network-informed strategy can prevent a disease outbreak from becoming endemic.
Even though we used a SIR model to describe the dynamics underlying the disease, the neural network will automatically adapt itself to different underlying dynamics described by more complex epidemiological models, which might include, e.g., the disease incubation time [9], delays in the testing process [19], or even different patterns of movement of the individuals (e.g., periodic motion and long-range travel) [10]. It is also possible to provide the neural network with demographic information (e.g., individual risk factors, such as age, employment, and preexisting conditions) as well as spatial information (e.g., the location of the individuals, differentiating various places of aggregation, such as hospitals, markets, and schools), or even easily accessible medical tests (e.g., cough recordings [44]). Furthermore, the neural-network-informed approach presented in this work can be generalized to other situations, such as fire prevention [45] or econometrics [46].
FIG. 4. [...] and g corresponding fraction of the population in each category; in this case, the disease is completely eradicated. The dashed lines in e and g correspond to the free evolution of the disease and are reported from c for comparison. In all cases, the solid lines indicate the average over multiple runs ($N_{\mathrm{runs}}$ = 100), while the shaded areas correspond to 90% confidence intervals.

SIR model
We divide the population of $N = 10^5$ identical individuals into three epidemiological categories: susceptible individuals S, infectious individuals I, and recovered individuals R, as in the original SIR model [3]. The individuals move on a square lattice with side l = 320 according to a stochastic model [24,47]. The position of each individual n ∈ [1, N] at each time step t ∈ [0, 150] is given by its coordinates $\mathbf{x}_n(t) = [x_n(t), y_n(t)]$. Each individual is an independent random walker confined to move within a small area of the lattice centered around its initial random position $\mathbf{x}_n(0) = [x_n(0), y_n(0)]$. At each time step, it can move in its Moore neighborhood [48] according to the following displacements:

$$x_n(t+1) - x_n(t) = \begin{cases} -1 & \text{with probability } \frac{1}{3} - k\,[x_n(t) - x_n(0)] \\ 0 & \text{with probability } \frac{1}{3} \\ +1 & \text{with probability } \frac{1}{3} + k\,[x_n(t) - x_n(0)] \end{cases} \qquad (1)$$

$$y_n(t+1) - y_n(t) = \begin{cases} -1 & \text{with probability } \frac{1}{3} - k\,[y_n(t) - y_n(0)] \\ 0 & \text{with probability } \frac{1}{3} \\ +1 & \text{with probability } \frac{1}{3} + k\,[y_n(t) - y_n(0)] \end{cases} \qquad (2)$$

where k = 0.04 determines the radius $r_k \approx 10$ cells within which each individual moves. The positions of all individuals are updated synchronously and independently from each other.
The infection spreads as follows: when a susceptible individual occupies the same cell as an infectious individual, it becomes infectious with probability β in each time step. Transmission occurs only from infectious individuals that are not frozen. Each infectious individual becomes recovered with probability γ at each time step. The parameters used are β = 0.6 and γ = 0.03, except for Figure 3, where we also employ β = 0.8.
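The update rule in this paragraph can be sketched as a single synchronous step. This is a minimal illustration rather than the authors' code: the function name and data layout are hypothetical, a susceptible individual is assumed to make one infection draw per time step however many infectious individuals share its cell, and frozen susceptible individuals are assumed to be protected because they do not interact.

```python
import random
from collections import Counter


def sir_step(states, positions, frozen, beta=0.6, gamma=0.03, rng=random):
    """Return the new list of states ("S", "I" or "R") after one time step.

    states    -- list of "S", "I" or "R", one entry per individual
    positions -- list of (x, y) lattice cells, one entry per individual
    frozen    -- set of indices of isolated individuals
    """
    # Number of non-frozen infectious individuals in each occupied cell.
    active_sources = Counter(
        positions[n] for n, s in enumerate(states) if s == "I" and n not in frozen
    )
    new_states = list(states)  # synchronous update: read old states, write new ones
    for n, s in enumerate(states):
        if s == "S" and n not in frozen:
            # Assumption: a single draw with probability beta per time step.
            if active_sources[positions[n]] > 0 and rng.random() < beta:
                new_states[n] = "I"
        elif s == "I" and rng.random() < gamma:
            # Recovery applies to frozen and non-frozen individuals alike.
            new_states[n] = "R"
    return new_states
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible.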
Each individual is also characterized by a "temperature", which is normally distributed and corresponds to 36.8 ± 1.0 for healthy (i.e., susceptible and recovered) individuals, and to 37.4 ± 1.2 for infectious individuals, with a great overlap between the two distributions (Figure 1b).
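To give a feel for how much the two temperature distributions overlap, the following sketch uses Python's standard `statistics.NormalDist`. It assumes that the quoted ± values are standard deviations, and the screening threshold is a hypothetical choice made only for illustration; neither is stated in the text.

```python
from statistics import NormalDist

# Assumption: the "±" values are standard deviations of normal distributions.
healthy = NormalDist(mu=36.8, sigma=1.0)      # susceptible and recovered
infectious = NormalDist(mu=37.4, sigma=1.2)   # infectious

# Overlapping coefficient of the two densities (1 = identical, 0 = disjoint).
overlap = healthy.overlap(infectious)

# Hypothetical threshold: one standard deviation above the healthy mean.
threshold = healthy.mean + healthy.stdev

# Fraction of infectious individuals that fall below the threshold.
missed = infectious.cdf(threshold)

# Fraction of healthy individuals that fall above the threshold.
false_alarm = 1.0 - healthy.cdf(threshold)
```

With this illustrative threshold, roughly 63% of infectious individuals would go unnoticed by temperature alone while about 16% of healthy individuals would be flagged, which is why temperature is used only to prioritize tests and not as a diagnosis.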

SIRS model
The SIRS model is an alternative to the SIR model that assumes that immunization against the disease is temporary. Therefore, recovered individuals lose their immunization and become susceptible again with probability ρ in each time step. We employ ρ = 0.02.
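Because recovery and loss of immunity are drawn independently at each time step, the times spent in the infectious and recovered states are geometrically distributed. As a rough guide to the time scales involved (an estimate that ignores isolation and the finite length of the simulation, with $\tau_I$ and $\tau_R$ introduced here only for illustration):

$$\langle \tau_I \rangle = \frac{1}{\gamma} = \frac{1}{0.03} \approx 33 \text{ time steps}, \qquad \langle \tau_R \rangle = \frac{1}{\rho} = \frac{1}{0.02} = 50 \text{ time steps}.$$

Both durations are a sizeable fraction of the 150-step simulation, so reinfection can plausibly occur within a single run.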

Contact tracing
We keep track of individuals that occupy the same cell at a certain time step by introducing a contact matrix whose entries are given by a Kronecker delta δ, which has value 1 if the pair of individuals n and m occupy the same cell at time t, and 0 otherwise. The total number of contacts for individual n, $C_n^{\mathrm{tot}}(t)$, is the sum of these entries over all other individuals m and over the 50 time steps before time t. The number of contacts with confirmed infectious individuals, $C_n^{i}(t)$, is the same sum restricted to the individuals m for which $\delta_m^{i}(t) = 1$, where $\delta_m^{i}(t)$ is 1 if individual m has already been tested and found positive at time t, and 0 otherwise.
When implementing the lockdown strategy based on contact tracing, we list the agents in descending order of $C_n^{i}(t)$, and we sort those with equal value based on their temperature. At each time step, we select for testing the first $N_{\mathrm{test}}$ = 100 individuals in this list. We use the rest of the list for selecting individuals to freeze, whose number is set to match that of the neural-network-informed strategy. In this way, we can compare the two approaches using the same number of tests and the same number of frozen individuals. When the target number of individuals to isolate is larger than the number of individuals in the contact list (e.g., at the beginning of the simulation when the number of confirmed cases is small), we build an additional list from which to select the remaining individuals, which includes individuals that never had direct interactions with confirmed cases, but have been within a radius of 8 cells in the last 50 time steps; we also sort this additional list based on the temperature of the individuals.
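The contact counts and the ranking described in this section can be sketched as follows. The function names and the data layout (a list of position snapshots) are hypothetical, and the sketch counts a past contact as "infectious" whenever the other individual is confirmed positive by the current time, which is one reading of the definition above.

```python
from collections import defaultdict


def contact_counts(history, confirmed, window=50):
    """Return (c_tot, c_inf), the contacts of each individual over the last `window` steps.

    history   -- list of snapshots, oldest first; each snapshot is a list of
                 (x, y) cells indexed by individual
    confirmed -- set of indices of individuals that have tested positive
    """
    n_individuals = len(history[-1])
    c_tot = [0] * n_individuals
    c_inf = [0] * n_individuals
    for snapshot in history[-window:]:
        # Group individuals by the cell they occupy at this time step.
        cells = defaultdict(list)
        for n, cell in enumerate(snapshot):
            cells[cell].append(n)
        for occupants in cells.values():
            n_confirmed = sum(1 for m in occupants if m in confirmed)
            for n in occupants:
                c_tot[n] += len(occupants) - 1              # everyone else in the cell
                c_inf[n] += n_confirmed - (n in confirmed)  # confirmed cases other than n
    return c_tot, c_inf


def rank_for_tracing(c_inf, temperature, untested):
    """Sort untested individuals by confirmed contacts, then temperature, both descending."""
    return sorted(untested, key=lambda n: (c_inf[n], temperature[n]), reverse=True)
```

In the strategy described above, the first 100 entries of the ranked list would be tested and the following entries frozen until the target number of isolated individuals is reached.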

Neural network
We employ a dense neural network with three hidden layers with 16 neurons each and ReLU activation function [49,50]. The output layer has one single neuron with a softmax activation function returning a value p ∈ [0, 1]. Additionally, we use dropout for the hidden layers as a way to avoid overfitting [51] (dropout rate 0.2, so that in each training epoch only 80% of the neurons are active).
The input to the neural network at time t includes $R_{4,n}(t)$, $R_{8,n}(t)$, $R_{16,n}(t)$, $C_n^{i}(t)$, and $C_n^{i}(t)/C_n^{\mathrm{tot}}(t)$ for time steps [t − 9, t], where $C_n^{i}(t)$ and $C_n^{\mathrm{tot}}(t)$ are the number of infectious and total contacts (Eqs. 4 and 5), and $R_{r,n}(t)$ is the number of individuals that have tested positive within a radius r of individual n, computed as a sum over all infected individuals. The training of the neural network is performed using information relative to the individuals that have already been tested (which is split between a training set and a validation set [52]). The loss function is the mean square error, the optimization uses the stochastic gradient descent method implemented in the Adam optimizer [53,54], and the number of training epochs is fixed to 100 (see Supplementary Figure S1). While we use only two labels for the training (0 for susceptible individuals and 1 for infectious individuals), the trained network returns a prediction that is a continuous value p ∈ [0, 1].
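A minimal Keras sketch of a network with the layer sizes, dropout rate, loss, and optimizer described above is given below. It is an illustration under stated assumptions, not the authors' code: the five 10-step time series are assumed to be flattened into a 50-element input vector, and the single output unit uses a sigmoid, a common way to obtain a value in [0, 1] from one neuron (a softmax over a single unit always evaluates to 1). The names `build_model`, `N_FEATURES`, and `N_STEPS` are hypothetical.

```python
from tensorflow import keras
from tensorflow.keras import layers

N_FEATURES = 5   # three radius counts, infectious contacts, and contact ratio
N_STEPS = 10     # history window [t - 9, t]


def build_model():
    """Dense network: 50 inputs, three hidden layers of 16 ReLU units, one output."""
    model = keras.Sequential([
        keras.Input(shape=(N_FEATURES * N_STEPS,)),  # assumption: flattened input
        layers.Dense(16, activation="relu"),
        layers.Dropout(0.2),
        layers.Dense(16, activation="relu"),
        layers.Dropout(0.2),
        layers.Dense(16, activation="relu"),
        layers.Dropout(0.2),
        layers.Dense(1, activation="sigmoid"),       # assumption: sigmoid output
    ])
    # Mean square error loss and Adam optimizer, as stated in the text.
    model.compile(optimizer="adam", loss="mse")
    return model
```

Training on the tested individuals would then be a call such as `model.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=100)`, where the four arrays are placeholders for the training and validation split.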
Using the prediction of the network, we split the individuals that have not been tested yet into three groups: (1) p > 0.995: individuals with a high chance of being infectious, who are frozen without testing. (2) 0.5 < p < 0.995: individuals with a medium chance of being infectious, amongst which the N test = 100 individuals with the highest temperature are tested. (3) p < 0.5: individuals with a low chance of infection.
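The three-way split above can be sketched as a small function. The name `triage` and the dictionary-based inputs are hypothetical, and the treatment of values exactly equal to 0.5 or 0.995 is an assumption, since the text uses both open and closed intervals for the middle group.

```python
def triage(p, temperature, n_test=100, p_freeze=0.995, p_test=0.5):
    """Split untested individuals into (to_freeze, to_test) from the network outputs.

    p           -- dict mapping individual index -> predicted risk in [0, 1]
    temperature -- dict mapping individual index -> current temperature
    """
    # Group 1: high risk, frozen without testing.
    to_freeze = [n for n, risk in p.items() if risk > p_freeze]
    # Group 2: medium risk (assumption: both boundaries included).
    candidates = [n for n, risk in p.items() if p_test <= risk <= p_freeze]
    # Spend the limited tests on the candidates with the highest temperature.
    candidates.sort(key=lambda n: temperature[n], reverse=True)
    # Group 3 (risk below p_test) is left untouched.
    return to_freeze, candidates[:n_test]
```

For example, with risks `{0: 0.999, 1: 0.9, 2: 0.7, 3: 0.2, 4: 0.6}` and two available tests, individual 0 is frozen and the two warmest medium-risk individuals are tested.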
We implement the neural network using the Python libraries TensorFlow and Keras [55].

ACKNOWLEDGEMENT
We acknowledge support from the MSCA-ITN-ETN project ActiveMatter sponsored by the European Commission (Horizon 2020, Project Number 812780).