Epidemic Process over the Commute Network in a Metropolitan Area

An understanding of epidemiological dynamics is important for prevention and control of epidemic outbreaks. However, previous studies tend to focus only on specific areas, indicating that application to another area or intervention strategy requires a similar time-consuming simulation. Here, we study the epidemic dynamics of the disease-spread over a commute network, using the Tokyo metropolitan area as an example, in an attempt to elucidate the general properties of epidemic spread over a commute network that could be used for a prediction in any metropolitan area. The model is formulated on the basis of a metapopulation network in which local populations are interconnected by actual commuter flows in the Tokyo metropolitan area and the spread of infection is simulated by an individual-based model. We find that the probability of a global epidemic as well as the final epidemic sizes in both global and local populations, the timing of the epidemic peak, and the time at which the epidemic reaches a local population are mainly determined by the joint distribution of the local population sizes connected by the commuter flows, but are insensitive to geographical or topological structure of the network. Moreover, there is a strong relation between the population size and the time that the epidemic reaches this local population and we are able to determine the reason for this relation as well as its dependence on the commute network structure and epidemic parameters. This study shows that the model based on the connection between the population size classes is sufficient to predict both global and local epidemic dynamics in metropolitan area. Moreover, the clear relation of the time taken by the epidemic to reach each local population can be used as a novel measure for intervention; this enables efficient intervention strategies in each local population prior to the actual arrival.

Numerical burden to calculate the PSCM is relatively small compare to IBM simulation, thence it seems reasonable to use the actual population data of Tokyo metropolitan area, then to use the sample data of UTC. However, when each population size is scaled-up to the actual size than the contact rate should be scaled-down simultaneously with the same ratio to get a realistic value of basic reproduction ratio. Therefore, the results of both stochastic and deterministic PSCM are exactly the same for both data sets (actual population and UTC sample). For the sake of easier comparison between the IBM simulation and PSCM analysis, we have utilized the same UTC sample data for PSCM.

B stochastic PSCM
Here, we describe how we calculate the probability that a global epidemic occurs in the Tokyo metropolitan area using a branching process [2] associated with our population size class model (PSCM). For this purpose, a global epidemic is defined as the case in which the infection never dies out in the branching process (i.e., extinction of disease did not occur). Therefore, we first calculate the probability of disease extinction for an initially infected individual living in a home population of size class n (with representative size K n ) and commuting to a work population of size class m (with representative size K m ). The probability of a global epidemic is then simply given by subtracting the extinction probability from 1.
We begin by defining the probability that an individual will commute between a given pair of home and work population size classes. As defined earlier, the number of individuals who live in a home population of size class and commute to a work population of size class is L nm ( Figure 2D in the main manuscript). The probability that an individual living in a home population of size class n commutes to a work population of size class m is given by ϕ W (m|n) = L nm / ∑ m ′ L nm ′ . In the same vein, the probability that an individual commuting to a work population of size class m lives in a home population of size class n is given by ϕ H (n|m) = L nm / ∑ n ′ L n ′ m . The single primary infectious individual belongs to a commuter population of the (n 0 , m 0 )-th size class. Therefore, the expected numbers of secondary infections in the home and work populations are given by the basic reproductive ratio as R H n0 = βK n0 /γ and R W m0 = βK m0 /γ, respectively [3,4]. We assume complete mixing of each local population and that all local populations are initially consisted only of susceptible individuals. Susceptible hosts can be infected only by sharing either a home or a work population with an infectious host. Therefore, the number of secondarily infected hosts appearing in a non-initially infected home (work) population of size class n (m) is given by R W m0 ϕ H (n|m 0 ) (R H n0 ϕ W (n 0 |m)), and the expected number of secondarily infected hosts appearing in a commuter population of the (n, m)-th size class is given by Here, δ nm denotes the Kronecker delta (i.e., δ nm = 1 if n = m and δ nm = 0 if n ̸ = m). R n0m0 (n, m) for a commuter population of the (n, m)-th size class may be non-zero only when either the home or work population is shared with commuters in the initially infected population (i.e., either n = n 0 or m = m 0 ). Assuming that the number of secondary infections from an infectious host follows a Poisson distribution, the probability that there will be k secondarily infected hosts in a universally susceptible commuter population of the (n, m) size class as Next, we use the branching process to calculate the extinction probability of the infection. Let Q t (n 0 , m 0 ) represent the probability that all of the infectious commuters originating from a single initially infectious individual in a commuter population of the (n 0 , m 0 )-th size class will go extinct within t infection cycles.
This event is equivalent to the probability that all of the infectious commuters driven from k secondarily infected hosts in any possible commuter population of the (n, m)-th size class will go extinct within t − 1 infection cycles: Note that because this is a branching process focusing on the extinction probability of the infectious descendants of a single infected individual, subsequent infection from the secondarily infected hosts is assumed to occur mutually independently. Equation (3) can be used as a recurrence formula to calculate Q t (n 0 , m 0 ) from Q 0 (n 0 , m 0 ) = 0. The last equality holds because the extinction probability within 0 generations of infection must be zero. By iterating Equation (3), we obtain Q t (n 0 , m 0 ) for an arbitrary value of t until it converges to a fixed-point value for t → ∞. The fixed-point value can be obtained by solving the implicit relationship which gives the probability that the infection will eventually be extinct in the branching process. Because a global epidemic is defined as an exclusive event of disease extinction, the probability of a global epidemic can then be calculated as With reference to equation (1), above equation (4) can further be simplified as follows.
The first factor in the last equality can be interpreted as an equation to give the probability of initial extinction within the initial home and work populations (i.e., invasion of only the initial home and work populations and not the entire commute population) and the last 2 factors as a correction term for it. Neglecting this correction term allows the equation for the probability of a global epidemic to be approximated as This simply gives the probability of an epidemic in a set of initial home and work populations with a combined local population size of K n0 + K m0 .

C deterministic PSCM (1) Epidemic dynamics among the commuter population size classes
Here, we describe our attempt to derive various properties of the epidemic dynamics of the commute network in the Tokyo metropolitan area by considering a deterministic system of difference equations for a commuter population of the (n, m)-th size class. Individuals in a commuter population of the (n, m)-th size class are classified by disease state as susceptible, infectious, or removed, allowing the total number of individuals in the (n, m) class, L nm , to be decomposed as The first terms of equation (7) and equation (8)  Here, we assume that the entire population initially consisted only of susceptible individuals except for the single initially infected host. Using these initial conditions, we solved equations (7)-(9) to obtain the final size of the global epidemic (the fraction of the total number of individuals who acquired the infection during the entire epidemic period), the peak time of the epidemic (the time until the total number of infected individuals attains its peak), the final size of each local epidemic, and the arrival time of the epidemic in each local population. In addition to these numerical results, we also used equations equations (7)-(9) to obtain several analytical results concerning the final size of the epidemic and the arrival time of the epidemic in each local population, as follows.

(2) Final size of the epidemic in each local population
The final size of the epidemic in each local population is defined as the fraction of individuals who have ever experience infection during the epidemic period. We denote the total numbers of susceptible, infectious, and removed individuals in the n-th home population size class as Equation (7) and equation (9) are then combined to obtain Where t = T ∆t, we substituted equation (12) in equation (11) with t = ∞ and using and combining this with equation (10), we have a set of equations to determine Ψ nm : This system of equations can be solved numerically by recursively inserting Ψ nm from the right side as Ψ nm on the left side, starting from Ψ nm = 1, until the result converges to a fixed point. Needless to say, the above values also give the final size of the epidemic in each local population within the specified population size class. Moreover, the final size of the global epidemic can also be calculated from Ψ nm as

(3) Arrival time of the epidemic in each local population
The arrival time of the epidemic in each local population is defined as the time until the infected individual first appears in the local population since the epidemic has started. This can be calculated from the time course of the number of infected individuals (i.e., y H n (t) and y W m (t)) in each population size class as follows.
To define the time t H n at which the first infected individual appears in a home population of size class n, we first note that y H n (t) is for the total number of infected individuals in such a size class. As the total number of hosts in the home population size class n is L H n , and as the representative population size of class n is K n , there are L H n /K n such populations. Therefore, the number of infected hosts at time t in each local home population of size class n is given by y H n /(L H n /K n ), and t H n and, similarly, t W m , are defined as To obtain the approximate formula (including the logarithmic dependence mentioned in the main text) for the arrival times we linearize the system of equations (7)-(9) as where the (n ′ , m ′ )-th eigenvalue is denoted as ρ n ′ m ′ and the (n, m)-th element of the corresponding right eigenvector as v . Because the right eigenvector and the left eigenvector form a biorthogonal set, the expansion coefficient c n ′ m ′ is given from the inner product between the (n ′ , m ′ )-th left eigenvector and the initial vector y nm (0) as Here, the (n, m)-th element of the (n ′ , m ′ )-th right eigenvector is denoted by u (n ′ ,m ′ ) nm and the initial condition in which the initial infected individual is only in a commuter population of the (n 0 m 0 )-th size class is used. Up to now, equation (18) and equation (19) give the formal solution for the linearized system of equation (17). As a second approximation step, we will assume exponential growth of the infected populations at a rate given by the largest real eigenvalue ρ. The left and right eigenvectors corresponding to the largest real eigenvalues are denoted as u nm and v nm , respectively. Because the linearized coefficient matrix is a non-negative matrix (i.e., the Perron-Frobenius theorem is applicable), the dominant eigenvalue is purely real; moreover, the elements of the corresponding left and right eigenvectors are also purely real. This means that at the long time limit, the contribution from the eigenstate with eigenvalue ρ would exceed the other eigenstates in the expansion of equation (18). Under this exponential growth approximation, the elements of the corresponding right eigenvector gives the relative ratio between the populations in the exponential growth phase and the elements of the corresponding left eigenvector gives the greproductive value,h which represents the contribution from each population to the exponential growth. Using this exponential growth approximation, it is possible to approximate the expansion of equation (18) as From this, we calculate y H n and y W m and then insert the results into the definition of the arrival time of the epidemic in equation (16) to obtain the arrival times of the epidemic as follows where v H n = ∑ m v nm and v W m = ∑ n v nm . The logarithmic population size dependence appears in the third factor of equation (21) (i.e., ln K n and ln K m ). However, the first factor of equation (21) contains a population size class dependence. In addition, both L H n (L W m ) and v H n (v W m ) should also have depend on the population size class. However, the calculated numerical results show that the population size class dependence of the first factor is relatively small (i.e., the dependence is somehow cancelled out in the ratio L H n /v H n (L W m /v W m )) relative to that of the third factor. Therefore, the clear logarithmic population size dependence in the arrival time of the epidemic in each local population that we observed in the IBM must have originated from the third factor of equation (21). Although the actual calculation of equation (21) requires a numerical eigenvalue calculation, it gives the formal explicit solution, which is very effective for examining the epidemic parameter dependence.