Who Should Get Vaccinated? Individualized Allocation of Vaccines Over SIR Network

How to allocate vaccines over heterogeneous individuals is one of the important policy decisions in pandemic times. This paper develops a procedure to estimate an individualized vaccine allocation policy under limited supply, exploiting social network data containing individual demographic characteristics and health status. We model spillover effects of the vaccines based on a Heterogeneous-Interacted-SIR network model and estimate an individualized vaccine allocation policy by maximizing an estimated social welfare (public health) criterion incorporating the spillovers. While this optimization problem is generally an NP-hard integer optimization problem, we show that the SIR structure leads to a submodular objective function, and provide a computationally attractive greedy algorithm for approximating a solution with a theoretical performance guarantee. Moreover, we characterise a finite-sample welfare regret bound and examine how its uniform convergence rate depends on the complexity and riskiness of the social network. In the simulation, we illustrate the importance of considering spillovers by comparing our method with targeting without network information.


Introduction
Allocation of a resource over individuals who interact within a social network is an important task in many fields, such as economics, medicine, education, and engineering (Lee et al. (2020), Banerjee et al. (2013), among others). One of the important policy decisions of this sort in pandemic times is how to allocate vaccines over heterogeneous individuals to control the spread of disease and protect the lives of the vulnerable. It is crucial for the vaccine allocation rule to take into account the spillover effect of cutting transmission of the disease.
Since the start of the COVID-19 pandemic, governments around the world have gone to great lengths to collect network data in which one can trace who is contacting whom. Motivated by these observations, we study how to estimate optimal individualized allocations of vaccines under a capacity constraint, using micro-level social network data. The data are informative about the covariates of $N$ units, their health status, and their associated neighbors. Using in-sample information, we evaluate the risk to each unit, calculated from its own covariates and spillovers from its heterogeneous neighbors, using an individualized Susceptible-Infectious-Recovered model. The purpose of vaccine allocation is to maximize public health by selecting units to be vaccinated. Obtaining an optimal assignment is, however, challenging since whether a treatment is optimal for an individual depends on which treatments are given to her neighbors. This implies that the search for an optimal allocation has to be performed over the entire network jointly, not individually. This paper makes two main contributions. The first contribution is to develop methods to estimate vaccine assignment policies that exploit network information at the micro-level.
The second contribution is to show that the empirical welfare criterion built upon the SIR spillover structure delivers a submodular objective function, which we exploit to obtain computationally attractive algorithms to solve the welfare optimization problem. Distinct from the existing approach of estimating individualized allocation policies under network interference (Viviano, 2019; Ananth, 2020), our setting does not assume the availability of Randomized Control Trial (RCT) data. Instead, we assume the availability of estimated values of the spillover parameters from other sources, which we plug into our SIR model. Exploiting already estimated SIR parameter values for immediate targeting and allocation is useful when time is of the essence and the need for policy action is pressing, and it avoids the cost of running an RCT.
To optimize the empirical welfare of allocation policies, one naive approach is to evaluate the empirical welfare exhaustively for all possible combinations of vaccine allocations over individuals. We refer to this as the brute-force approach. Although the brute-force approach is guaranteed to optimize the empirical welfare, it is not practicable since the number of possible combinations grows exponentially as the number of individuals in the network increases. On the other hand, giving up on optimization entirely and implementing random allocation is practicable, but leads to a significant waste of the vaccine supply, which we show in our simulation exercises.
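To give a sense of scale for the brute-force approach, the following minimal sketch counts the candidate allocations when exactly d doses are assigned among N individuals. The function name and the choice of a supply covering 10% of the network are illustrative assumptions, not part of the paper's procedure.

```python
from math import comb


def num_allocations(n_units, doses):
    """Number of distinct ways to give exactly `doses` vaccines to `n_units` people."""
    return comb(n_units, doses)


if __name__ == "__main__":
    # Hypothetical supply: doses for 10% of the network.
    for n_units in (10, 35, 100, 1000):
        doses = n_units // 10
        print(n_units, doses, num_allocations(n_units, doses))
```

Each candidate allocation requires one evaluation of the welfare criterion, so the count above is a lower bound on the work an exhaustive search must do.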
Given the challenge in optimizing the empirical welfare, what we recommend in this paper is an allocation policy obtained by greedy optimization. A greedy optimization algorithm in the current setting sequentially allocates a vaccine to the individual in the network who is most influential for improving social welfare. In general, greedy algorithms are not guaranteed to yield an optimum. With the current welfare criterion built upon the SIR spillover structure, however, we can obtain a non-decreasing submodular objective function. Relying on the seminal result in discrete convex analysis shown by Nemhauser et al. (1978), we show that the greedy algorithm delivers an allocation policy at which the value of the objective function is worse than the optimum only up to a universal constant factor, independent of the spillovers, size, and density of the SIR networks. Our derivation of the population welfare regret of the greedily estimated allocation policy reflects the potential loss of welfare due to the infeasibility of obtaining the brute-force allocation policy.
We further illustrate the advantages of our method in our simulation exercises. In a small network setting (up to 35 individuals in the network), comparisons with the brute-force allocation rules reveal that our proposed greedy allocation rule leads to an optimal solution. In a large network setting, we evaluate the performance of our method against two different assignment rules: random assignment, and targeting without considering network information. The welfare improvement relative to these two baselines ranges from 4% to 12%, and this result is insensitive to the values of the SIR parameters and the size and density of the network.
To assess how uncertainty in the SIR parameter estimates affects the welfare performance of the estimated policy, we derive a uniform upper bound on the welfare regret of our vaccine allocation rule and its convergence rate with respect to the size of the sample used for obtaining the SIR parameter estimates. The uniform upper bound on regret depends upon two things. First, it depends on $n$, the sample size of the separate dataset used to estimate the SIR parameters. Second, it depends on a ratio that combines the maximum number of neighbors $N_M$, the number of infected units $N_I$, and the number of available vaccine doses $d$, relative to the network data sample size $N$ (i.e., $(d \min\{N_M, d\} + 2 d N_M + \min\{N_I, d\})/N$). As $N_M$ and $N_I$ grow, the complexity and risk of the social network increase, which can worsen the welfare regret performance of the estimated vaccine allocation rule.
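As a small illustration of the network-dependent ratio quoted above, the sketch below evaluates $(d \min\{N_M, d\} + 2 d N_M + \min\{N_I, d\})/N$ for a given contact network. It assumes that $N_M$ is the largest row sum of a 0/1 adjacency matrix and that $N_I$ is the count of currently infected units; the function name and inputs are hypothetical.

```python
def regret_complexity(adj, infected, doses):
    """Evaluate (d*min(N_M, d) + 2*d*N_M + min(N_I, d)) / N for a 0/1 adjacency matrix.

    adj      : list of lists, adj[i][j] == 1 if units i and j are neighbors
    infected : list of 0/1 flags, one per unit
    doses    : number of available vaccine doses d
    """
    n_units = len(adj)
    n_m = max(sum(row) for row in adj)  # assumed reading of N_M: maximum degree
    n_i = sum(infected)                 # assumed reading of N_I: number of infected units
    d = doses
    return (d * min(n_m, d) + 2 * d * n_m + min(n_i, d)) / n_units


if __name__ == "__main__":
    # Path network 0 - 1 - 2 - 3 with one infected unit and two doses.
    path = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
    print(regret_complexity(path, [1, 0, 0, 0], doses=2))
```

Holding $N$ and $d$ fixed, a denser network (larger $N_M$) or a larger outbreak (larger $N_I$) raises this quantity.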
The remainder of this paper is organized as follows. We first discuss the relevant literature in the rest of this section. Section 2 details various models, and the HI-SIR model in particular, and the wider setting. Section 3 is concerned with estimation, including the estimation of SIR parameters and the construction of the QIP problem. The optimization procedure is contained in Section 4. Section 5 contains the theoretical results. Simulation details are shown in Section 6, and Section 7 concludes. All proofs and derivations are shown in the appendix.

Related Literature
Our work contributes to the literature on statistical treatment rules, which was first introduced into econometrics by Manski (2004). The optimal treatment allocation regime has been studied in many fields, such as medical statistics (Zhao et al., 2012, 2015), operational research (Loiola et al., 2007) and economics. Following the pioneering works of Hannan (1957) and Savage (1951), researchers in econometrics and machine learning often use regret to evaluate the performance of decision rules. The recent literature on statistical treatment rules includes Dehejia (2005), Hirano and Porter (2009), Stoye (2009, 2012), Tetenov (2012), Bhattacharya and Dupas (2012), Kitagawa and Tetenov (2018), Zhou et al. (2018), Manski (2019), Kasy and Sautmann (2019), Athey and Wager (2020), Kock et al. (2020), Mbakop and Tabord-Meehan (2021), Manski and Tetenov (2021), Sakaguchi (2021), and Kitagawa et al. (2021), among others. The planner's objective function in the majority of these works is a sum of individual outcomes under the no-interference assumption (i.e., the Stable Unit Treatment Value Assumption of Rubin (1974)). This assumption does not hold in this study because of the network spillover effects that are present. To our knowledge, there are only two other papers that also consider the network setting in statistical treatment choice: Viviano (2019) and Ananth (2020). These two papers assume the availability of pilot data from RCT studies performed over networks in order to form empirical welfare criteria. Their frameworks are not restricted to the SIR setting of the current paper and cover spillover structures commonly assumed in social science applications. In contrast, our approach forms welfare estimates by imposing the HI-SIR model structure and plugging in values of the primitive spillover parameters that are estimated or calibrated in some external study (e.g., Baqaee et al. (2020)). Another notable difference is that we consider allocation policies that are not constrained other than via the capacity constraint, while Viviano (2019) and Ananth (2020) assume the class of implementable allocation policies has a finite VC-dimension to control overfitting to the training RCT sample.
The SIR model was originally proposed by Kermack and McKendrick (1927), and is now the workhorse model in the epidemiological literature. Many extended versions have been studied in epidemiological analyses, such as the Susceptible-Infected-Susceptible model (Nåsell, 1996) and the Susceptible-Exposed-Infected-Recovered model (Li and Muldowney, 1995). During the global pandemic, an epidemiological literature has sprung up within economics. Atkeson (2020) and Stock (2020) introduced the SIR model into economics to study the implications of the current pandemic for the US economy. We introduce heterogeneity into the SIR model, which is similar to what Acemoglu et al. (2020) do in studying the Multi-Risk SIR model. That paper assumes, however, that the infection rate after the release of a vaccine equals zero, which means it does not consider the vaccine allocation problem. Our work contributes to the current literature by studying micro-level vaccine assignment rules in a heterogeneous SIR model with network information. In contrast, the existing works analyzing vaccine allocation rules focus on solving for the optimal proportion of vaccinated units in the population (Pastor-Satorras and Vespignani (2002), Manski (2010, 2017)). Chen et al. (2020) analyze vaccine allocation using a heterogeneous SIR model, but consider vaccine allocation policies at the group level rather than the individual level.
We build a connection to the literature on using submodular functions to solve optimization problems. The performance guarantee of a general greedy algorithm for solving submodular maximization problems with a cardinality constraint was first established by Nemhauser et al. (1978). The later literature extends the cardinality constraint to a more general constraint, the matroid constraint (Fisher et al., 1978; Cunningham, 1985). See Bach (2011) and Krause and Golovin (2014) for overviews of papers studying optimization of submodular functions. In this work, we discuss a submodular function with a uniform matroid constraint (i.e., capacity constraint) and a more general partition matroid constraint.
We note that our approach to the vaccine allocation problem is related to the influence maximization problem first formulated by Kempe et al. (2003). Chen et al. (2010) investigate submodularity of objective functions and greedy optimization algorithms in this problem. Applications of the influence maximization problem include targeting for viral marketing (Domingos and Richardson, 2001) and optimal information spread in social networks (Bakshy et al., 2011). There are two widely studied information diffusion models in this literature: the Independent Cascade Model (Goldenberg et al., 2001) and the Linear Threshold Model (Granovetter, 1978). Despite some similarity between the diffusion models and our SIR model, this literature has not considered the individualized vaccine allocation problem.
We also note that there is a growing literature on estimation of treatment effects under network interference. Manski (2013) discusses identification of treatment effects and spillover effects under a deterministic interference graph and a set of relevant potential outcomes. The increased number of network datasets that have recently become available has motivated further work on this topic, including Sävje et al. (2017), Aronow et al. (2017), Athey et al. (2018), Basse et al. (2019), and Leung and Moon (2019). Li and Wager (2020) non-parametrically estimate direct and indirect effects of treatment in a random network setting. Vazquez-Bare (2020) analyzes estimation of spillover effects using an instrumental variable. See Kline and Tamer (2020) and Graham and De Paula (2020) for recent reviews of econometric analysis in the presence of social interactions.

Setup and Identification
We consider a basic model to study the vaccination allocation problem. Let us first introduce the timeline and data setting that we consider in this work.
As shown in the illustration, we suppose there are two periods. At $t = 0$, policymakers initially observe the network structure $A$ (i.e., adjacency matrix) linking $N$ individuals, for which we provide further details below. Policymakers then observe covariates $X_i \in \mathcal{X} \subset \mathbb{R}^{d_x}$ and current period health state $H_0 \in \{S, I, R\}$ for each of the $N$ individuals. The health states $\{S, I, R\}$ stand for Susceptible, Infected, and Recovered. We assume the network structure $A$ is observed before personal health status to avoid the impact of self-isolation on the network structure. At $t = 1$, policymakers start to assign the vaccine. After a short vaccination period, people begin to meet their neighbors, which we call the interaction period. The health state during that period is defined as $H_1 \in \{S, I, R, D\}$, where $D$ stands for death.
Since at the time of assigning vaccination, $H_1$ is not yet observed by researchers, a stochastic health state will be used to evaluate personal risk. The ultimate goal of policymakers is to maximize the expected social health situation via the allocation of vaccines.
In our setting, units are connected through a social network. We assume the following property of the network structure holds:
Assumption 2.1. (Undirected Relationships) The interference graph is undirected, i.e., $A_{ij} = A_{ji}$.
The symmetric $N \times N$ adjacency matrix $A$ specifies who is in contact with whom, with the $(i, j)$th element of $A$, denoted by $A_{ij}$, equal to one if unit $i$ and unit $j$ have positive contact time, and zero otherwise. By convention, all the diagonal elements $A_{ii}$ are equal to zero. If $A_{ij} = 1$, then we say that $i$ and $j$ are neighbors. Let $N_i$ denote the set of neighbors of unit $i$, so that $A_{ij} = 1$ if $j \in N_i$ and $i \in N_j$. The size of the spillover (i.e., the probability of disease transmission) between units $i$ and $j$ depends not only on $A_{ij}$ but also on the amount of their contact time and the transmission rates, which are allowed to be asymmetric between them. We accordingly have a directed weighted network structure for the spillovers, as shown in later sections. Now, let us introduce the notation that we use in the following sections. First, $v_i$ is the individual vaccine assignment rule (i.e., $v_i = 1$ if unit $i$ gets the vaccine). Let $v$ denote $(v_1, \ldots, v_N) \in \{0, 1\}^N$, and $X$ denote $(X_1, \ldots, X_N) \in \mathbb{R}^{N \times d_x}$. Let $S_i$ be the susceptible state indicator in the first period (i.e., $S_i = 1\{H_{0i} = S\}$), let $I_i$ be the infected state indicator in the first period (i.e., $I_i = 1\{H_{0i} = I\}$), and let $R_i$ be the recovered state indicator in the first period (i.e., $R_i = 1\{H_{0i} = R\}$). Moreover, let $|N_i|$ denote the number of neighbors of unit $i$.
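The notation above maps directly onto simple data structures. The following sketch, with hypothetical function names and a toy four-unit network, checks the conventions stated for $A$ (symmetric, zero diagonal), extracts the neighbor set $N_i$, and builds the state indicators $S_i$, $I_i$, $R_i$ from first-period health states.

```python
def is_valid_contact_network(adj):
    """Check Assumption 2.1 (symmetry) and the zero-diagonal convention."""
    n = len(adj)
    symmetric = all(adj[i][j] == adj[j][i] for i in range(n) for j in range(n))
    zero_diagonal = all(adj[i][i] == 0 for i in range(n))
    return symmetric and zero_diagonal


def neighbors(adj, i):
    """Return the neighbor set N_i as a sorted list of unit indices."""
    return [j for j, a_ij in enumerate(adj[i]) if a_ij == 1]


def state_indicators(h0):
    """Turn first-period states ('S', 'I' or 'R') into the indicator vectors (S, I, R)."""
    s = [1 if h == "S" else 0 for h in h0]
    i = [1 if h == "I" else 0 for h in h0]
    r = [1 if h == "R" else 0 for h in h0]
    return s, i, r


if __name__ == "__main__":
    # Toy network: unit 0 meets units 1 and 2, and unit 2 meets unit 3.
    adj = [[0, 1, 1, 0], [1, 0, 0, 0], [1, 0, 0, 1], [0, 0, 1, 0]]
    print(is_valid_contact_network(adj), neighbors(adj, 0))
    print(state_indicators(["S", "I", "R", "S"]))
```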

Heterogeneous-Interacted-SIR model
To measure the personalized transition probability, we use an HI-SIR model. Our model is defined in discrete time within a simplified setting of two time periods. In the first period, we observe the health state of each unit $H_0$, which belongs to $S$ (Susceptible), $I$ (Infected), or $R$ (Recovered). In the second period, the state variable is $H_1$. Compared with $H_0$, $H_1$ includes one more state, $D$ (Death). Without the vaccine, the state can move from susceptible to infected, then to either recovery or death. Now, we consider the setting after introducing the vaccine. Generally, vaccination has two purposes: the first is limiting the spread of disease, and the second is treatment.
Vaccination builds up the immune system, which leads to recovery. However, the effectiveness of vaccination (i.e., the percentage of vaccinated units that recover) is not clear. For simplicity, we assume that Assumption 2.2 holds.
To further simplify the setting, we split all units into a finite number of disjoint groups based on their characteristics. The infection rate varies across pairs of groups. This setting could be extended to the individual level, but the micro-level infection rates would then need to be known. Here, we consider two groups and use age as a binary indicator: $G_1$ (Young) and $G_2$ (Old). We now define $a_i$ and $b_i$ as the indicators of membership in the two groups. We specify one of the key components in SIR models, the infection rate of unit $i$, as: where $\beta_{sk} = -\kappa_s \ln(1 - c_{sk})$, $c_{sk}$ is the probability of successful disease transmission following a contact between group $s$ and group $k$ (i.e., $c_{11}$ measures the transmission probability from one unit to another within the young group, $c_{12}$ is the corresponding probability of transmission from a unit in the old group to a unit in the young group, with similar definitions for $c_{21}$ and $c_{22}$), and $\kappa_s$ is the average number of contacts in group $s$ at each time period. $\beta_{sk}$ describes the effective contact rate of the disease between group $s$ and group $k$.
The derivation of equation 3 can be found in the appendix.
In the above expression, $I_j (1 - v_j)$ means a susceptible individual can only be infected by neighbors who were infected and not vaccinated. Those neighbors may come from various groups. We calculate the fraction of neighbors in each group and multiply them by the associated risk parameters. Consistent with the definition of $c_{sk}$ above, the risk parameter $\beta_{sk}$ measures the probability that a susceptible individual in group $s$ is infected by an infected individual from group $k$ in one time period.
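The description above can be sketched numerically. The first function implements the stated mapping $\beta_{sk} = -\kappa_s \ln(1 - c_{sk})$. The second function is only an assumed reading of the verbal description (the share of unit $i$'s neighbors that are infected and unvaccinated, weighted by the risk parameter for the relevant group pair); it is not the paper's equation, and all names and numbers are hypothetical.

```python
import math


def effective_contact_rate(kappa_s, c_sk):
    """beta_sk = -kappa_s * ln(1 - c_sk), as stated in the text."""
    return -kappa_s * math.log(1.0 - c_sk)


def infection_rate(i, adj, group, infected, vaccinated, beta):
    """Assumed form of unit i's infection rate (illustration only).

    group[j] is 0 (young) or 1 (old); beta[s][k] is the risk parameter for a
    susceptible unit in group s facing an infected neighbor in group k.
    """
    nbrs = [j for j, a_ij in enumerate(adj[i]) if a_ij == 1]
    if not nbrs:
        return 0.0  # an isolated unit faces no spillover risk
    s = group[i]
    risk = 0.0
    for j in nbrs:
        # Only infected and unvaccinated neighbors contribute: I_j * (1 - v_j).
        risk += beta[s][group[j]] * infected[j] * (1 - vaccinated[j])
    return risk / len(nbrs)


if __name__ == "__main__":
    adj = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
    beta = [[0.2, 0.4], [0.1, 0.3]]  # hypothetical risk parameters
    print(effective_contact_rate(kappa_s=2, c_sk=0.5))
    print(infection_rate(0, adj, [0, 0, 1], [0, 1, 1], [0, 0, 0], beta))
    print(infection_rate(0, adj, [0, 0, 1], [0, 1, 1], [0, 0, 1], beta))
```

In the last call, vaccinating the infected old neighbor removes its contribution, which is the spillover channel the allocation problem exploits.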
We now define $\{\gamma_1, \gamma_2\}$ as the recovery rates and $\{\delta_1, \delta_2\}$ as the mortality rates from infection in group 1 and group 2, respectively. Given this, we can formulate the probability of staying in the infection state for the infected unit $i$ as: Since the probabilities of recovery and death depend purely on personal physical fitness, there is no interactive part in equation 4. The transition probability to the infected state is then: In the above expression, the probability of an unvaccinated unit being infected has two components. The first is the probability of a healthy unit being infected. The second is the probability of staying in the infected state for those infected in the first period. Under Assumption 2.2, a vaccinated unit has zero probability of being infected. Similarly, the transition probability to the susceptible state is: An unvaccinated unit can only exit the susceptible state by infection. Therefore, the probability of staying in the susceptible state decreases with the risk parameter $\beta_{sk}$, which depends on the number of infected neighbors and the number of contacts with them. The remaining two states do not rely on the network structure. First, the transition probability to the recovered state: In the above expression, recovery has two different sources. One is the vaccine, and the other is self-immunity. The effect of self-immunity is heterogeneous and varies with personal characteristics. The probability of building immunity in each group is $\gamma_1, \gamma_2$. The last state is death, which occurs with probability

Optimal Vaccine Allocation Problem
In Emanuel et al. (2020), a group of medical ethics experts suggests that a successful vaccine is needed to reduce death and morbidity from infection, and is also needed for the restoration of economic and social activity. Following that suggestion, we choose our baseline outcome variable as the weighted average of the probability of being healthy in the second period.
The idea of using a weighted probability is to allow a flexible policy target for the planner. For example, if the planner wants to incorporate the importance of economic recovery into the policy objective, she may want to place more weight on the probabilities of being healthy of those who can contribute more to economic output. For instance, the planner could specify the weights on individuals to depend on their individual characteristics, including working hours and other socioeconomic characteristics (i.e., $g_i = g(X_i)$). We assume the weight is non-negative for every unit. Taking these into consideration, equation (9) specifies the goal of the vaccine allocation policy as a constrained optimization problem: where and $d \geq 1$ is a positive integer for the exogenous cardinality constraint. The main idea of the above objective function is to maximize the weighted probability of being in the susceptible or recovered state in the second period by appropriately assigning the $d$ doses of vaccine at the end of the first period.
In equation (9), $P_{hi}$ is the heterogeneous state transition function, which describes the probability of $h \in \{S, R\}$ in the second period. This transition probability depends on the individual's covariates and previous state, including whether or not the individual is vaccinated, and on the associated network structure. We adopt the HI-SIR model to formulate the above transition function, which has been provided in the previous subsection.
One relevant question is: will vaccine allocation change the network structure? Yes, it would change the behaviour of vaccinated units. For example, vaccinated units may be more inclined to go out than unvaccinated units. Given this, the number of contacts at each time period $\kappa_s$ and the network structure $A$ would change after the vaccine allocation. Our framework allows the network structure to vary without affecting the optimal allocation of vaccines in a special case where only the vaccinated units change their behaviours. This is because, under our perfect treatment assumption, the vaccinated units no longer spread the disease or become infected, and their behavioral changes do not affect the health statuses of their neighbors or themselves. On the other hand, our framework cannot accommodate a general case where the unvaccinated units also change their behaviours, since in that case the heterogeneous SIR parameters in the objective function change in response to the vaccine allocation. To allow for this scenario, we could incorporate uncertainty as to the values of $\kappa_s$ and $A$ in the second period, for instance, by optimizing an objective function that takes the expectation over the SIR parameters and the adjacency matrices conditional on $v$. We do not, however, consider such an extension in this paper and leave this topic for future research.

Estimation
In order to measure the individual risk level using the HI-SIR model, we need to know the associated SIR parameters: the transmission rates (i.e., $\beta_{11}, \beta_{12}, \beta_{21}, \beta_{22}$) and the recovery rates (i.e., $\gamma_1, \gamma_2$). Given that we cannot observe the true values of those parameters, it is infeasible to evaluate the objective function (9) based on the in-sample information of $(H_0, X)$ and $A$ of the target network. We therefore assume access to a separate dataset with sample size $n$, or an external study analyzing it, from which we can form estimates of these exogenous parameters. We construct an empirical version of the population welfare (9) and maximize over the feasible allocation policies. To reflect the precision of the SIR parameter estimates in the welfare performance of an estimated allocation rule, we explicitly take into account the sampling uncertainty of the parameter estimates in our derivation of the welfare regret upper bound.

Estimation of SIR Parameters
The estimation of the infection rate and death rate always faces severe missing data problems, as discussed in Manski and Molinari (2020). Keeling and Rohani (2011) point out that, usually, researchers first estimate the reproductive ratio $R_0$, which is the average number of individuals that one sick person infects. Then, the infection rate can be derived from the estimated recovery rate $\gamma$ and $R_0$. In our case, the reproductive ratio is heterogeneous at the group level.
where $R_{0sk}$ is the number of infectious individuals in group $s$ resulting from one sick person in group $k$. We need to estimate the average number of younger infectious and older infectious individuals resulting from one sick person in group 1 and group 2, and also the recovery rate in each group.
Remark 3.1. We do not discuss what is a desirable procedure for estimating the model parameters in this work, since the choice of estimator depends on the type of data (e.g., seroprevalence data, reported cases data, etc.). See Keeling and Rohani (2011) for further details. For COVID-19 transmission, estimation of a homogeneous $R_0$ and other SIR parameters has been performed in several papers, including Fernández-Villaverde and Jones (2020), Ferguson et al. (2020), and Korolev (2021). They note the difficulty in calibrating critical parameters at an early stage of the pandemic due to the lack of credible data, which motivates the partial identification analysis of Manski and Molinari (2020) and Stoye (2021).
Our approach, however, assumes availability of credible point estimates and does not allow identified-set estimates for the SIR parameters. See Ellison (2020) and Akbarpour et al. (2020) for recent estimates of heterogeneous SIR parameters.

Quadratic Integer Programming
Plugging the parameter estimates into our HI-SIR model, we now have the sample analog of the population maximization problem (9). We can formulate this optimization as a quadratic integer programming (QIP) problem, which in the context of an assignment problem over a network is synonymous with the Quadratic Assignment Problem (QAP) of Koopmans and Beckmann (1957). The probability of being healthy is expressed in equation (15). For the probability of being healthy in equation (15), there are two linear terms and one quadratic term in $v$. The first term measures the direct effect of vaccination. A vaccinated unit is safe from infection with 100% probability. The last two terms describe the probability of being free of infection for unvaccinated units. Infected units naturally recover with probability $\{\gamma_1, \gamma_2\}$, which depends on their own characteristics. Units who are already recovered in the first period are free from infection in the second period. The last component takes into account the indirect effect of vaccination. For susceptible units, the probability of being infected by their infected neighbors is summarized by the interaction term.
After removing all the constant parts in equation (15), we obtain a simplified objective $F_n$. Since $F_n$ differs from $W_n$ only by an additive constant (conditional on the network structure and individual characteristics in the first period), maximizing $F_n$ is equivalent to maximizing the original empirical welfare function $W_n$. Therefore, from now on, we will focus on $F_n(v)$ as our new objective function. Within $F_n(v)$, there is a quadratic term plus linear components in $v$. Software such as CPLEX and Gurobi is available to solve general QIP problems. However, both applications require a symmetric weighting matrix, which does not hold in our case. This asymmetry comes from the infection process, since disease can only be transmitted from infected units to susceptible units, and not the reverse. In the next section, we discuss how to solve this QIP problem by establishing and exploiting the submodular property of our objective function.

Submodularity
We showed in the last section that we can formulate our objective function as a QAP. This kind of problem is well known to be NP-hard, and NP-hard to approximate (Cela, 2013). In general, we cannot solve a QAP in polynomial time, which is an issue in practice. We shall, however, show that the quadratic integer programming in our vaccine allocation problem can be linked to the submodular optimization problem. The benefit of submodularity is that there exist off-the-shelf algorithms that can solve a submodular minimization problem in exact
In simple terms, submodularity describes the diminishing returns property. The marginal increase in the average probability of being healthy decreases in the number of vaccinated units. This property is crucial for the maximization algorithm. For ease of exposition, we express the simplified empirical welfare $F_n$ as a set function with argument $V \in 2^{\mathcal{N}}$, where the binary vector of vaccine allocation $v \in \{0, 1\}^N$ and $V$ correspond by where We then denote the class of feasible allocation sets $V$ subject to the cardinality constraint by $\mathcal{V}_d$. The quadratic functional form of $F_n$ shown in (20) can be linked to one classic submodular function called a cut function. Cut functions have been well studied in combinatorial optimization and graph theory. We apply some of the results from that literature (e.g., Bach (2011)).
The proof is shown in the appendix. Note that the necessary and sufficient condition for submodularity shown in this lemma is distinct from negative semidefiniteness of the matrix $\hat{W}$. Since all the parameters in $\hat{w}_{ij}$ are non-negative, we must have $\hat{w}_{ij} \leq 0$, $\forall i, j = 1, \ldots, N$.
This immediately leads to the following theorem:
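The diminishing returns property discussed in this section can be checked by enumeration on a small example. The sketch below uses a generic quadratic set function (a linear gain per unit plus pairwise interaction weights) as a stand-in for $F_n$; the weights `w` and gains `c` are hypothetical numbers, not estimates from the paper, and the exhaustive check is only feasible for very small $N$.

```python
from itertools import combinations


def quad_set_function(subset, w, c):
    """Linear gains c[i] for i in the set plus pairwise weights w[i][j] for i != j in the set."""
    members = list(subset)
    value = sum(c[i] for i in members)
    value += sum(w[i][j] for i in members for j in members if i != j)
    return value


def is_submodular(n, f, tol=1e-12):
    """Check f(A + x) - f(A) >= f(B + x) - f(B) for all A subset of B and x outside B."""
    subsets = [frozenset(s) for r in range(n + 1) for s in combinations(range(n), r)]
    for a in subsets:
        for b in subsets:
            if not a <= b:
                continue
            for x in range(n):
                if x in b:
                    continue
                gain_small = f(a | {x}) - f(a)
                gain_large = f(b | {x}) - f(b)
                if gain_small < gain_large - tol:
                    return False
    return True


if __name__ == "__main__":
    c = [1.0, 1.0, 1.0]
    w_nonpositive = [[0.0, -0.2, 0.0], [-0.1, 0.0, -0.3], [0.0, 0.0, 0.0]]
    w_with_positive = [[0.0, 0.5, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    print(is_submodular(3, lambda s: quad_set_function(s, w_nonpositive, c)))
    print(is_submodular(3, lambda s: quad_set_function(s, w_with_positive, c)))
```

With non-positive pairwise weights, adding a unit to a larger vaccinated set never yields a larger marginal gain; a single positive weight is enough to break the property.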

Greedy Maximization Algorithm
Greedy maximization algorithms for submodular functions have been studied and frequently used for well over forty years. The performance guarantee of the algorithm that we study was first introduced by Nemhauser et al. (1978). This algorithm essentially uses the diminishing returns property.
Algorithm 1: Capacity Constrained Greedy Algorithm
In general, there is no performance guarantee for the greedy algorithm. However, as shown by Nemhauser et al. (1978) for a non-decreasing submodular function with a cardinality constraint (i.e., the capacity constraint in our case), the greedy maximization algorithm is guaranteed to yield an allocation rule $\hat{V}$ satisfying $F_n(\hat{V}) \geq \alpha_d F_n(V^*)$, where $V^* \in \mathcal{V}_d$ is a constrained optimum under the capacity constraint, and $\alpha_d$ is a positive constant that depends only on $d \geq 1$ and satisfies $\alpha_d \geq 1 - 1/e$ for all $d \geq 1$. This seminal result implies that the greedy maximization algorithm provides a universal optimization guarantee for non-decreasing submodular functions, $F_n(\hat{V}) \geq (1 - 1/e) F_n(V^*) \approx 0.63 F_n(V^*)$. Since we show in Theorem 4.1 that our objective function is non-decreasing and submodular, we obtain the following theorem as an immediate corollary of our Theorem 4.1 and Nemhauser et al. (1978).
Theorem 4.2 (Nemhauser et al. 1978). Let $F_n : 2^{\mathcal{N}} \to \mathbb{R}$ be the simplified empirical welfare function as defined in (20) and where is monotonically decreasing in $d$ and converges to $1 - e^{-1}$ as $d \to \infty$.
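A minimal sketch of greedy maximization under a capacity constraint is given below, together with an exhaustive search for comparison on a toy problem. It is an illustration of the generic technique under stated assumptions, not a transcription of Algorithm 1: the objective is a hypothetical non-decreasing submodular quadratic set function with made-up weights, and ties are broken by the lowest unit index.

```python
import math
from itertools import combinations

# Hypothetical objective: linear gains plus non-positive pairwise weights.
C = [1.0, 0.9, 0.8, 0.5]
W = [[0.0, -0.3, 0.0, 0.0],
     [-0.2, 0.0, -0.1, 0.0],
     [0.0, 0.0, 0.0, -0.2],
     [0.0, 0.0, 0.0, 0.0]]


def welfare(subset):
    """Toy stand-in for the simplified empirical welfare of a vaccinated set."""
    members = list(subset)
    value = sum(C[i] for i in members)
    value += sum(W[i][j] for i in members for j in members if i != j)
    return value


def greedy_cardinality(n, f, doses):
    """Add, one dose at a time, the unit with the largest marginal gain."""
    chosen = frozenset()
    for _ in range(min(doses, n)):
        candidates = [x for x in range(n) if x not in chosen]
        best = max(candidates, key=lambda x: f(chosen | {x}) - f(chosen))
        chosen = chosen | {best}
    return chosen


def brute_force(n, f, doses):
    """Exhaustive search over all sets of exactly `doses` units (small n only)."""
    return max((frozenset(s) for s in combinations(range(n), doses)), key=f)


if __name__ == "__main__":
    greedy_set = greedy_cardinality(4, welfare, doses=2)
    optimal_set = brute_force(4, welfare, doses=2)
    print(sorted(greedy_set), welfare(greedy_set))
    print(sorted(optimal_set), welfare(optimal_set))
    print(welfare(greedy_set) >= (1 - 1 / math.e) * welfare(optimal_set))
```

The greedy search evaluates the objective on the order of $N d$ times, whereas the exhaustive search must visit every feasible set.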

Targeting Constraint
Up until now, we have only considered a simple capacity constraint in the vaccine assignment rule. In reality, beyond the weight specification in the objective function, policymakers may want to prioritize some groups over others by limiting the number of vaccines that are administered in each group. For example, policymakers may limit access to vaccines for those people who can work at home. If we are able to divide individuals into two groups based on their job categories, say a group that can work at home and a group that cannot, then policymakers can set an upper bound on the number of vaccines that are available for the work-at-home group.
We call this kind of constraint a targeting constraint, and impose it in our model in such a way that each of the two age groups has a capacity constraint for the number of available vaccines: This targeting constraint belongs to a general class of constraints, the so-called matroid class. First, we use $\mathcal{I}$ to describe the subset of $2^{\mathcal{N}}$ that is compatible with all of the constraints imposed. If we restrict the set of vaccinated agents $V$ to belong to $\mathcal{I}$, which is part of a matroid $(\mathcal{N}, \mathcal{I})$, this constraint is called a matroid constraint.
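Definition 4.2 (Matroid). Let $\mathcal{I}$ be a nonempty family of allowable subsets of $\mathcal{N}$. Then the tuple $(\mathcal{N}, \mathcal{I})$ is a matroid if it satisfies:
• (Heredity) For any $D \subset E \subset \mathcal{N}$, if $E \in \mathcal{I}$, then $D \in \mathcal{I}$.
• (Augmentation) For any $D, E \in \mathcal{I}$, if $|D| < |E|$, then there exists an $x \in E \setminus D$ such that $D \cup \{x\} \in \mathcal{I}$.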
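Let $\mathcal{N}_1$ and $\mathcal{N}_2$ be the disjoint subsets partitioned by $X_i$ ($\mathcal{N}_1 \cup \mathcal{N}_2 = \mathcal{N}$). We can represent the targeting constraint by $\mathcal{I} = \{V \subseteq \mathcal{N} : |V \cap \mathcal{N}_1| \le d_1, |V \cap \mathcal{N}_2| \le d_2\}$. We can show that this $(\mathcal{N}, \mathcal{I})$ is a matroid, referred to as a partition matroid. First, we show heredity. For any $D \subset E$ with $E \in \mathcal{I}$, we must have $|D \cap \mathcal{N}_k| \le |E \cap \mathcal{N}_k| \le d_k$ for $k = 1, 2$, so $D$ satisfies the targeting constraint and $D \in \mathcal{I}$. Next, we show augmentation. For any $D, E \in \mathcal{I}$ with $|D| < |E|$, there must be a group $k$ such that $|D \cap \mathcal{N}_k| < |E \cap \mathcal{N}_k| \le d_k$. As a result, there must exist an element $x \in (E \cap \mathcal{N}_k) \setminus D$ such that $D \cup \{x\} \in \mathcal{I}$.

The problem of optimal treatment assignment subject to a partition matroid constraint is to maximize $F_n(V)$ over $V \in \mathcal{I}$. The following Algorithm 2 is guaranteed to produce a solution $\hat{V} \in \mathcal{I}$. Greedy maximization algorithms subject to a partition matroid constraint, applied to non-decreasing submodular functions, attain at least 50% of the optimal welfare.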
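To make the two matroid axioms concrete, the following sketch enumerates every subset of a small ground set and checks heredity and augmentation by brute force for per-group caps. The group memberships, the caps, and the helper names `independent` and `all_subsets` are hypothetical choices made only for this illustration.

```python
from itertools import combinations


def independent(V, groups, caps):
    """True if V uses at most caps[k] units from each group k."""
    return all(len(V & groups[k]) <= caps[k] for k in groups)


def all_subsets(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = sorted(ground)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


# Hypothetical toy instance: units 0-2 in group 1, units 3-5 in group 2.
groups = {1: frozenset({0, 1, 2}), 2: frozenset({3, 4, 5})}
caps = {1: 1, 2: 2}
ground = groups[1] | groups[2]

family = [V for V in all_subsets(ground) if independent(V, groups, caps)]
family_set = set(family)

# Heredity: every subset of an allowable set is allowable.
heredity = all(D in family_set for E in family for D in all_subsets(E))

# Augmentation: a smaller allowable set can be extended by some x in E \ D.
augmentation = all(
    any(D | {x} in family_set for x in E - D)
    for D in family
    for E in family
    if len(D) < len(E)
)
print(len(family), heredity, augmentation)
```

Exhaustive enumeration is only practical for a handful of units; it is meant as a sanity check of the definition, not as part of the allocation procedure.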
Algorithm 2: Targeting Constraint Greedy Algorithm

Proposition 4.1 (Fisher et al. 1978). Let $F_n : 2^{\mathcal{N}} \to \mathbb{R}$ be the simplified empirical welfare function as defined in (20) and $V^{**} \in \arg\max_{V \in \mathcal{I}} F_n(V)$. The greedy maximization algorithm shown in Algorithm 2 outputs $\hat{V} \in \mathcal{I}$ such that $F_n(\hat{V}) \ge \frac{1}{2} F_n(V^{**})$.

The performance guarantee of the greedy algorithm with the targeting constraint is worse than the performance guarantee of Algorithm 1. This implies a trade-off between additional constraints and the accuracy of computation. In the next section, we discuss the welfare regret bounds of the allocation rules estimated by the above greedy algorithms.
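A minimal sketch of a greedy rule under a targeting constraint is given below. At each round it adds the unit with the largest marginal gain among those whose group still has doses left, which matches the idea of scanning candidates in order of decreasing gain. The toy objective `coverage`, the group labels, and the caps are assumptions for illustration and do not come from the paper.

```python
def greedy_partition(ground_set, objective, group_of, caps, d):
    """Greedy maximization with a total capacity d and per-group caps."""
    selected = set()
    used = {k: 0 for k in caps}
    while len(selected) < d:
        base = objective(selected)
        # Only units whose group has remaining doses are feasible.
        feasible = [
            i for i in ground_set
            if i not in selected and used[group_of[i]] < caps[group_of[i]]
        ]
        if not feasible:
            break
        best = max(feasible, key=lambda i: objective(selected | {i}) - base)
        selected.add(best)
        used[group_of[best]] += 1
    return selected


# Hypothetical toy data: unit i "protects" the units listed in reach[i].
reach = {0: {0, 1, 2}, 1: {2, 3}, 2: {3, 4, 5, 6}, 3: {0}, 4: {6, 7}}
group_of = {0: 1, 1: 1, 2: 2, 3: 1, 4: 2}


def coverage(V):
    covered = set()
    for i in V:
        covered |= reach[i]
    return len(covered)


# At most two doses for group 1 and one dose for group 2, three doses overall.
V_target = greedy_partition(sorted(reach), coverage, group_of, {1: 2, 2: 1}, d=3)
# Same rule with caps that never bind, for comparison.
V_free = greedy_partition(sorted(reach), coverage, group_of, {1: 3, 2: 3}, d=3)
print(V_target, coverage(V_target), V_free, coverage(V_free))
```

In this toy instance the binding cap on group 2 lowers the attained objective relative to the unconstrained greedy choice, which illustrates the trade-off between additional constraints and attainable welfare.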

Perfect Treatment Assumption and Submodularity
Recall Assumption 2.2 (Perfect Treatment): A vaccinated unit enters the Recovered state, regardless of its previous state (i.e., $\Pr(H_{1i} = R \mid v_i = 1) = 1$). There are three possible ways to relax this assumption:
• The recovered units can still spread disease.
• The recovered units will become susceptible after one period (or a few periods).
• Some percentage of vaccinated units remain susceptible or infected.
In the first case, if the person is recovered at $H_0$, she will spread the disease during the first period. In that case, the recovered neighbors of unit $i$ will be taken into account by the infection rate $q_i$. This will not, however, change the sign of our weighting matrix, hence submodularity (by Theorem 4.1) still holds. In the second case, if unit $i$ is recovered in the first period (i.e., $H_{0i} = R$), she could become susceptible in the second period (i.e., $H_{1i} = S$). Then, she may be infected in the next period (i.e., $H_{2i} = I$). However, we only consider a one-period setting in this work, which rules out this risk. In the third case, varying this percentage only affects the coefficient of the linear term in the objective function (i.e., $\hat{c}_i$ in equation (17)), which is irrelevant to submodularity.

Regret Bounds
Following Manski (2004) and the subsequent literature on statistical treatment rules, we use regret to evaluate the performance of our algorithm for vaccine allocation. Let $F : 2^{\mathcal{N}} \to \mathbb{R}$ be the population analogue of $F_n(\cdot)$ in (20), where the estimated parameters are replaced by the truth. The expected regret measures the average difference in welfare between using the constrained optimal assignment rule $V^* \in \arg\max_{V \in \mathcal{V}_d} F(V)$ and using the estimated greedy allocation $\hat{V}$ obtained from Algorithm 1: $E_{P^n}\left[F(V^*) - F(\hat{V})\right]$, where $E_{P^n}$ is the expectation with respect to the sampling uncertainty of the parameter estimates in the external studies.
In this work, we assume that consistent estimators of the effective contact rate and the recovery rate are available from other studies. Generally, there is no requirement on the estimators except that Assumption 5.1 needs to hold. In Assumption 5.1, stated as (28), $P$ is the sampling distribution in another study that has sample size $n$.
The above assumption is an exponential tail bound of the kind obtained by applying Hoeffding's large deviation inequality (Hoeffding, 1963). Since $\beta_{sk}$ is the effective contact rate of the disease between group $s$ and $k$, and $\gamma_s$ is the recovery rate in group $s$, both are naturally bounded in $[0, 1]$. Hence, common estimators (e.g., the sample analog) meet the above condition. However, other tail bounds, which do not necessarily have the same form as the above, might apply for some other estimators. Our approach can accommodate various tail bounds, such as the tail bound associated with the maximum likelihood estimator (Miao, 2010).
The estimators for the contact rates and recovery rates may come from different studies with different sample sizes. In this case, we can view $n$ in Assumption 5.1 as the smallest sample size among the studies.
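As an illustration of why a sample-analog estimator of a rate in $[0, 1]$ satisfies an exponential tail bound, the sketch below compares the classical two-sided Hoeffding bound $2\exp(-2n\varepsilon^2)$ with a simulated tail frequency. The true rate, sample size, tolerance, and helper names are hypothetical, and the constants in the paper's Assumption 5.1 may differ from this textbook form.

```python
import math
import random


def hoeffding_bound(n, eps):
    """Two-sided Hoeffding bound for the mean of n independent draws in [0, 1]."""
    return 2.0 * math.exp(-2.0 * n * eps ** 2)


def empirical_tail(p, n, eps, reps, rng):
    """Share of simulated studies whose sample mean misses p by more than eps."""
    misses = 0
    for _ in range(reps):
        mean = sum(rng.random() < p for _ in range(n)) / n
        misses += abs(mean - p) > eps
    return misses / reps


rng = random.Random(0)
bound = hoeffding_bound(n=200, eps=0.1)
freq = empirical_tail(p=0.3, n=200, eps=0.1, reps=2000, rng=rng)
print(f"Hoeffding bound: {bound:.4f}, simulated frequency: {freq:.4f}")
```

The bound is distribution-free, so the simulated frequency is typically far below it; increasing `n` shrinks the bound exponentially.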
In order to derive the uniform convergence rate of the welfare regret, we decompose regret into three components, where $\hat{V}$ is the output from the greedy maximization algorithm under the capacity constraint. The first component describes the regret we would attain if the constrained optimum could be computed exactly. The second component measures the welfare loss introduced by the greedy algorithm. The third component indicates the loss from using the estimated objective function instead of the true objective function. We compute the upper bound of each component separately and then combine them.
First, we start from the derivation of the upper bound of the first component. This part is similar to the approach in Kitagawa and Tetenov (2018). Before looking at $V^*$, consider an inequality that holds for any $V \in \mathcal{V}_d$. Since this inequality applies to $F(V)$ for all $V$, it also applies to $V^*$. For the second component, we can obtain an upper bound by applying Theorem 4.2. Similarly to the first component, the third component can be bounded. Combining all the previous results, we obtain the upper bound of regret in (35).

Compared with the regret upper bound when one could compute $V^*$, the regret upper bound shown in (35) has one additional term. This additional term comes from equation (33) and captures the welfare loss induced by the use of the greedy algorithm. As we characterize below, the first term converges to zero as $n \to \infty$ under Assumption 5.1, while the second term remains independent of the accuracy of the parameter estimates. A simulation study in Section 6 assesses the magnitude of the optimization error of the greedy algorithm, and shows numerically that the greedy algorithm yields an exact optimum at least for small networks ($N = 35$). Based on this, we believe that the optimization error term of the greedy algorithm is much smaller than the universal theoretical bound $\frac{1}{e} F(V^*)$.

In the partition matroid (targeting constraint) case, by applying Proposition 4.1 and repeating the arguments used to derive (35), we obtain an analogous bound, where $V^{**}$ is an oracle optimum under the targeting constraint, $V^{**} \in \arg\max_{V \in \mathcal{I}} F(V)$.
In order to bound $\sup_{V \in \mathcal{V}_d} |F_n(V) - F(V)|$, we use the triangle inequality to bound $|F_n(V) - F(V)|$, where the absolute value of a matrix or vector stands for the element-wise absolute values. We can therefore decompose the maximal deviation into terms involving $\hat{W} - W$ and $\hat{C} - C$. Under Assumption 5.1, we can obtain an upper bound for the mean of each element in $\hat{W} - W$ and $\hat{C} - C$, as shown in the next lemma.
Lemma 5.1. Under Assumptions 2.1, 2.2, and 5.1, we have

Combining this lemma with equations (35) and (38), we obtain the following theorem:

Theorem 5.1. Let $N_M = \max_{i \in \mathcal{N}} |N_i|$, $N_I$ be the total number of infected units, and $g = \max_{i \in \mathcal{N}} g_i$. Under Assumptions 2.1, 2.2, and 5.1, we have

where $C$ is a universal constant and $d$ is the number of available vaccine doses.
The proof of the above theorem is shown in the appendix. In Theorem 5.1, we provide a distribution-free upper bound on the expected regret. We show that the convergence rate of the upper bound depends on the network data sample size $N$ and also the sample size $n$ for estimating the SIR parameters. At the same time, the regret upper bound is increasing in the complexity and the riskiness of the network. The intuition is that our algorithm finds it harder to identify the most valuable units when the maximum number of edges and the number of infected individuals in the network increase. The maximum individual weight $g$ also raises the upper bound of regret. Moreover, our algorithm finds it harder to identify the best allocation rule when the number of possible combinations increases, which occurs when the capacity constraint is relaxed. This also implies the benefit of quarantine. Since quarantine controls the maximum number of connections in the network, the effectiveness of vaccine allocation is boosted by such a government policy. Therefore, there is an advantage to complementing a vaccine assignment policy with quarantine, which is evidenced by our simulation exercises.

Simulation Exercises
In this section, we use an Erdős-Rényi model to generate random social networks. In each of the following tables, we use 100 different networks and take the average of the outcome variable across all of the networks. We further show the standard deviation of in-sample welfare to understand the variation of network structure. We choose the probability of allocating a unit to group 1 to be 40% and the probability of allocating a unit to group 2 to be 60% (i.e., $P(X_i = G_1) = 0.4$ and $P(X_i = G_2) = 0.6$). In the epidemiological literature, researchers usually find the steady state of the SIR parameters. In order to identify the impact of varying the SIR parameters, we choose two different sets of parameter values to run the simulation. Throughout our simulation studies, we do not consider sampling errors in the parameter estimates and focus on optimizing welfare with the true parameter values plugged in. Table 1 summarizes all the values of the SIR parameters that we have used. In addition, we choose three different densities, 0.1, 0.5 and 1, in order to identify the effect of network complexity. Here, density = 1 means that the network is fully connected (i.e., a complete graph). We include the fully connected case to understand the behaviour of our heuristic algorithm not only in the sparse network case but also in the densest case. We also compare three capacity constraints, $d = 7\%N$, $10\%N$, $20\%N$, to evaluate the marginal performance gain of our greedy algorithm. We choose equal weights in the following comparisons. We do, however, show the impact of changing weights on the number of vaccinated younger units in Table 5.
In the following sections, we compare our greedy algorithm with three familiar allocation rules. We first compare our algorithm with a brute force method in order to find the difference between the potentially sub-optimal greedy solution and the brute-force optimal solution. However, the number of possible combinations increases dramatically with the number of nodes and the capacity constraint, so we cannot use a large number of agents to compute the brute force optimum in the simulation. Given this, in Section 6.2, we use a random assignment rule as a baseline to evaluate the performance of our algorithm in a large network setting. The third allocation rule that we compare our greedy algorithm with is an allocation rule which assigns the vaccine without considering network information. We compare the greedy algorithm with this third rule in Section 6.3.
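The simulation design described above can be sketched as follows: an Erdős-Rényi network in which every pair of units is linked independently with probability equal to the density, plus a random group label for each unit. The helper names, the seed, and the use of rounding down for the capacity $d$ are assumptions made for this sketch, since the text does not state how fractional capacities are rounded.

```python
import random


def erdos_renyi(n_units, density, rng):
    """Symmetric neighbor sets; each pair is linked with probability `density`."""
    neighbors = {i: set() for i in range(n_units)}
    for i in range(n_units):
        for j in range(i + 1, n_units):
            if rng.random() < density:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors


def draw_groups(n_units, p_group1, rng):
    """Assign each unit to group 1 with probability p_group1, else group 2."""
    return [1 if rng.random() < p_group1 else 2 for _ in range(n_units)]


rng = random.Random(2021)
N = 35
network = erdos_renyi(N, density=0.5, rng=rng)
groups = draw_groups(N, p_group1=0.4, rng=rng)

# Capacity constraints of 7%, 10% and 20% of N, rounded down (an assumption).
capacities = [int(share * N) for share in (0.07, 0.10, 0.20)]
max_degree = max(len(nbrs) for nbrs in network.values())
print(capacities, max_degree)
```

Setting `density=1.0` yields the complete graph, the densest case considered in the simulations.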

Comparing with Brute Force
Table 2: The value of welfare (the sum of probabilities of being healthy in the second period) averaged over 100 random networks (standard errors in parentheses). We use the Greedy Algorithm or the Brute Force algorithm to determine who in each network should be vaccinated. Column headers: Allocation Rule (Greedy Algorithm, Brute Force); Capacity Constraint.
Since Theorem 4.2 shows that the gap between the optimal solution and the heuristic result is at most 37%, we want to explore this theoretical difference using a numerical study. We list all the possible combinations and use brute force to search for the optimal solution given a manageable number of units. We specify the maximum number of units to be $N = 35$, which is limited by computer performance. As the number of nodes increases, the possible number of combinations grows exponentially. The memory requirement and running time become impractical in a more realistic case. We recognize that the results from a small network may not carry over to a large network setting, but they help us to understand the regret of our greedy algorithm to some degree. We summarize the in-sample welfare $W_n$ of these two approaches in Table 2.
In the small network case, we find that our greedy algorithm finds optimal allocation rules in all cases that we consider, which indicates a good performance of our method. We also notice that the welfare that is associated with the optimum decreases with the number of edges. As we relax the capacity constraint, welfare increases rapidly. The main purpose of this comparison is to get an idea of how much worse the empirical welfare at the greedy solution can be relative to the brute force optimum. More results are illustrated in the following two sections.
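The comparison between the greedy rule and exhaustive search can be sketched on a toy problem as below, together with the number of allocations a brute-force search must visit for $N = 35$ and $d = 7$. The toy objective `coverage` and the helper names are illustrative assumptions; the paper's welfare function and simulation code are not reproduced here.

```python
from itertools import combinations
from math import comb


def brute_force(ground_set, objective, d):
    """Exhaustive search over all allocations of exactly d units."""
    return max((set(c) for c in combinations(ground_set, d)), key=objective)


def greedy(ground_set, objective, d):
    """Add, d times, the unit that raises the objective the most."""
    selected = set()
    for _ in range(d):
        rest = [i for i in ground_set if i not in selected]
        selected.add(max(rest, key=lambda i: objective(selected | {i})))
    return selected


# Hypothetical toy data: unit i "protects" the units listed in reach[i].
reach = {0: {0, 1, 2}, 1: {2, 3}, 2: {3, 4, 5, 6}, 3: {0}, 4: {6, 7}}


def coverage(V):
    covered = set()
    for i in V:
        covered |= reach[i]
    return len(covered)


units = sorted(reach)
gaps = {
    d: coverage(brute_force(units, coverage, d)) - coverage(greedy(units, coverage, d))
    for d in (1, 2, 3)
}
# Number of candidate allocations when choosing 7 of 35 units.
n_candidates = comb(35, 7)
print(gaps, n_candidates)
```

Because the toy objective is non-decreasing, searching over allocations of exactly `d` units loses nothing relative to allowing fewer doses.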

Comparing With Random Assignment
In this section, we use a random assignment rule to define the baseline of vaccine allocation. We randomly draw an allocation 10,000 times and calculate the average value of the outcome variable. Random allocation is one common assignment rule for policymakers. The purpose of this simulation is to learn about the improvement offered by our greedy allocation rule. In order to evaluate its performance in a relatively large network setting, we choose $N = 500$ and $800$. Table 3 records the main differences in terms of in-sample welfare between these two methods.
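A random-assignment baseline of this kind can be sketched as a Monte Carlo average over uniformly drawn allocations of size $d$. The toy objective `coverage`, the seed, and the helper name `random_baseline` are assumptions for illustration only.

```python
import random


def random_baseline(ground_set, objective, d, draws, rng):
    """Average objective over `draws` uniformly random allocations of size d."""
    units = list(ground_set)
    total = 0.0
    for _ in range(draws):
        total += objective(set(rng.sample(units, d)))
    return total / draws


# Hypothetical toy data: unit i "protects" the units listed in reach[i].
reach = {0: {0, 1, 2}, 1: {2, 3}, 2: {3, 4, 5, 6}, 3: {0}, 4: {6, 7}}


def coverage(V):
    covered = set()
    for i in V:
        covered |= reach[i]
    return len(covered)


rng = random.Random(0)
avg_random = random_baseline(sorted(reach), coverage, d=2, draws=10_000, rng=rng)
best_pair = coverage({0, 2})  # best allocation of two units in this toy example
print(round(avg_random, 3), best_pair)
```

The ratio of `avg_random` to the value attained by a targeted rule is the kind of quantity that a welfare-ratio comparison reports.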
From Table 3, we find that the performance of both methods decreases with the number of edges, which is also true for the first comparison. As the number of edges increases, the greedy algorithm finds it harder to identify who is relatively crucial in the network, which supports our interpretation of Theorem 5.1 in the previous section. This effect becomes more pronounced as the capacity constraint is relaxed. In the most extreme case, when everyone is connected with each other, the performance of our method is still better than the random assignment rule. This performance gap widens with the capacity constraint. We also find that the average welfare increases by 12% when the capacity constraint increases by $0.1N$.
Moreover, this improvement is robust to variation in the number of nodes and to changes in the density of the network. A larger number of nodes lowers the performance of our method in a sparse network setting. For $N = 800$, welfare in the densest network is 14% lower than the welfare with density = 0.5, no matter which capacity constraint and parameter set we use.

Table 3: The value of welfare (the sum of probabilities of being healthy in the second period) averaged over 100 random networks (standard errors in parentheses). We use the Greedy Algorithm or the Random allocation to determine who in each network should be vaccinated. Column headers: Allocation Rule (Greedy Algorithm, Random Assignment); Capacity Constraint.
If we look at the random assignment rule in Table 3, its performance is much worse than that of the greedy algorithm. This difference increases with the complexity of the network and the number of nodes in it. The performance of the random assignment rule improves as we relax the capacity constraint. However, this improvement is only about 7% when the capacity constraint increases by $0.1N$. Compared with the greedy algorithm, random assignment is less effective. Given the scarcity of the vaccine, we waste considerable resources by assigning it randomly. In the case of full edges, the performance of random allocation is also inferior. The ratio of the welfare attained by random allocation to the welfare attained by the greedy algorithm is illustrated in Figure 1. This ratio increases slowly with the number of edges and decreases with the number of nodes in the network. In addition, the ratio decreases clearly with the number of vaccines that are available.

Usually, in the literature on treatment assignment, researchers use observational data or experimental data without network structure information to study the optimal policy. As a result, the allocation regime assigns the treatment without considering spillover effects, which could lead to a sub-optimal result. We call this kind of regime Targeting Without Network Information (TWNI). In this simulation, we want to learn the welfare loss from using TWNI versus our method.
Generally, TWNI assigns treatment based on personal characteristics. In this study, we only have one covariate: age. This means either the old group receives the vaccine or the young group receives the vaccine. Under the previous setting (i.e., older people are more likely to be infected and to die), group 2 will consume the entire vaccine allocation. Given different capacity constraints, this assignment rule selects units to be vaccinated from group 2 until the upper bound is reached. Table 4 indicates that the results for TWNI allocation are similar to those for random allocation. In addition, despite the outcome value varying with the SIR parameters, the sizable improvement from using network information to allocate vaccination is quite robust to variations in the size and density of the network. Our numerical study shows that if the number of available vaccine doses is small, the loss from ignoring network information is relatively small too (around 4%). This loss increases dramatically, however, with the number of available vaccines. In addition, the performance gap between our greedy algorithm and the other two allocation methods decreases with the network complexity (i.e., the number of edges). Under what might be described as a lockdown policy, the density of the network is maintained at a relatively low level, which raises the cost of ignoring spillovers. This cost also increases with the number of units in the population, which is a problem in a more realistic setting. The performance improvement from considering network information is robust to variation of the SIR parameters, and an allocation rule which ignores spillovers wastes a sizeable proportion of a scarce resource.
In Table 5, we illustrate the impact on the percentage of vaccinated younger units of varying the weight choice $g_i$ (in this simulation exercise, we choose equal weights for the units in the same group). If we assign weight $g_1 = 1.5$ for $G_1$, we find that all the vaccines are consumed by younger units. Compared with the equal weight case, this number changes dramatically. Moreover, we find that our greedy algorithm offers more vaccines to younger units in the case of parameter set 2 than parameter set 1, i.e., when the transmission rate parameters are higher within and across the groups.


Conclusion
In this work, we have introduced a novel method to estimate individualized vaccine allocation rules under network interference. We introduce the heterogeneous-interacted-SIR model to specify the spillover effects of infectious disease. We show that the welfare objective function of the vaccine allocation problem is non-decreasing and submodular, and so is its empirical analogue formed by plugging in the estimates of the SIR parameters. Based on this diminishing returns property, we provide a greedy algorithm with a performance guarantee under two different exogenous constraints, which can easily accommodate various targets that policymakers commonly face in reality. Moreover, we show that this algorithm implies an upper bound for regret that converges uniformly at rate $O(n^{-1/2})$. Using simulation, we point out the importance of considering network information in the allocation problem.
Several open questions and extensions are worth considering in future work. First, this paper considered a one-time vaccine allocation. We did not consider multiple allocation periods, or how to decide the allocation dynamically. A related important question is how to jointly optimize allocations and timing of first and second doses of vaccines, as recently discussed for Covid-19 vaccines in Maier et al. (2021), Tuite et al. (2021), and Wang et al. (2021). Moreover, we do not study how the vaccine allocation rule impacts the outcome variables after multiple periods. As discussed in Bu et al. (2020), changes to the network structure should be considered in a dynamic setting. Second, we only compare the greedy algorithm with the brute-force optimum in a small network. Other than the universal bounds of Theorem 5.1, we do not know the performance of our method relative to the optimal solution in the large network data setting. Third, we did not impose any constraints other than the capacity constraint and the targeting constraint. For interpretability and fairness, we may want to additionally restrict the policy rule to be a simple function of observed covariates. We regard these as interesting questions that are worthy of consideration.

Appendix A The Transmission Term
Consider a susceptible individual $i$ with $\kappa_s$ contacts in each period, a number which depends on his own characteristics. Of these contacts, a fraction $\sum_{j \in N_i} I_j (1 - v_j) a_j / |N_i|$ are contacts with infected neighbors from group 1, and a fraction $\sum_{j \in N_i} I_j (1 - v_j) b_j / |N_i|$ are contacts with infected neighbors from group 2. If we define $c_{sk}$ as the probability of successful disease transmission at each contact between group $s$ and group $k$, then $1 - c_{sk}$ is the probability that transmission between group $s$ and group $k$ does not take place. From these quantities we obtain the probability $1 - q_i$ that a unit $i$ is not infected in one time period. We now define $\beta_{sk} = -\kappa_s \ln(1 - c_{sk})$ and plug it into the expression for $1 - q_i$, which allows us to rewrite that probability in exponential form. Recalling that $e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$, we obtain a first-order approximation of the probability of infection $q_i$ in each time period.
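The following sketch checks numerically how close a first-order approximation of this kind is. It assumes, as a reconstruction rather than a quotation of the paper's equations, that the no-infection probability is the product $(1 - c_{s1})^{\kappa_s f_1} (1 - c_{s2})^{\kappa_s f_2}$, where $f_1$ and $f_2$ are the fractions of contacts with infected, unvaccinated neighbors from groups 1 and 2. All parameter values and function names are hypothetical.

```python
import math


def infection_prob_exact(kappa, c1, c2, frac1, frac2):
    """1 minus the assumed product-form probability of escaping infection."""
    escape = (1.0 - c1) ** (kappa * frac1) * (1.0 - c2) ** (kappa * frac2)
    return 1.0 - escape


def infection_prob_linear(kappa, c1, c2, frac1, frac2):
    """First-order approximation using beta_k = -kappa * ln(1 - c_k)."""
    beta1 = -kappa * math.log(1.0 - c1)
    beta2 = -kappa * math.log(1.0 - c2)
    return beta1 * frac1 + beta2 * frac2


# Hypothetical values: 5 contacts per period, small per-contact risks.
args = dict(kappa=5, c1=0.02, c2=0.05, frac1=0.1, frac2=0.2)
q_exact = infection_prob_exact(**args)
q_linear = infection_prob_linear(**args)
print(f"exact: {q_exact:.5f}, linear: {q_linear:.5f}")
```

Under this assumed form the exact probability equals $1 - e^{-x}$ with $x$ the linear expression, so the linear version overstates the risk slightly and the gap is of order $x^2/2$.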
Combining the previous results, then taking the absolute value and expectation of each side and applying the triangle inequality, we obtain the bound. Since $\beta_{sk}$ is the effective contact rate of the disease between group $s$ and $k$, it is naturally bounded in $[0, 1]$.

A greedy algorithm can approximately solve a submodular maximization problem with a capacity constraint in polynomial time. The seminal result of Nemhauser et al. (1978) provides a universal bound for the quality of approximation, as detailed below in Section 4.2.

Definition 4.1 (Submodular function). Let $\mathcal{N} = \{1, 2, \ldots, N\}$. A real-valued set function $F : 2^{\mathcal{N}} \to \mathbb{R}$ is submodular if and only if, for all subsets $A, B \subseteq \mathcal{N}$, we have: $F(A) + F(B) \ge F(A \cup B) + F(A \cap B)$.

Theorem 4.1. The objective function $F_n(V)$ is a non-decreasing submodular function for any adjacency matrix, covariate values, and parameter estimates.

Theorem 4.1 is the key result in our paper. It describes two important properties of our objective function: monotonicity and submodularity. We exploit these two properties to justify the use of the greedy maximization algorithms shown in the next subsection.
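The role of the sign of the weighting matrix can be illustrated by brute force on a small example. The sketch below checks the inequality $F(A) + F(B) \ge F(A \cup B) + F(A \cap B)$ over all pairs of subsets for a quadratic-plus-linear set function, once with non-positive off-diagonal weights and once with a positive off-diagonal entry. The matrices, the vector `c`, and the helper names are hypothetical and are not the paper's estimated weights.

```python
from itertools import combinations


def quadratic(V, W, c):
    """F(V) = sum of W[i][j] over i, j in V, plus sum of c[i] over i in V."""
    return sum(W[i][j] for i in V for j in V) + sum(c[i] for i in V)


def is_submodular(n, F):
    """Check F(A) + F(B) >= F(A | B) + F(A & B) for all subsets of range(n)."""
    subsets = [frozenset(s) for r in range(n + 1) for s in combinations(range(n), r)]
    return all(
        F(A) + F(B) >= F(A | B) + F(A & B) - 1e-12
        for A in subsets
        for B in subsets
    )


# Hypothetical symmetric weights with non-positive off-diagonal entries.
W_neg = [
    [0.0, -0.2, -0.1, 0.0],
    [-0.2, 0.0, 0.0, -0.3],
    [-0.1, 0.0, 0.0, -0.1],
    [0.0, -0.3, -0.1, 0.0],
]
# Same weights, except one off-diagonal pair is made positive.
W_pos = [row[:] for row in W_neg]
W_pos[0][1] = W_pos[1][0] = 0.4
c = [1.0, 0.8, 0.6, 0.9]

ok_neg = is_submodular(4, lambda V: quadratic(V, W_neg, c))
ok_pos = is_submodular(4, lambda V: quadratic(V, W_pos, c))
print(ok_neg, ok_pos)
```

With non-positive off-diagonal weights, the marginal gain of adding a unit can only shrink as the set grows, which is the diminishing returns property the greedy algorithm relies on.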
The greedy algorithm exploits the diminishing returns property of the submodular function. The idea is to iteratively select the most valuable element until the capacity constraint is reached. At each round, the algorithm performs $O(N)$ function evaluations to identify the marginal gain of each element. The number of rounds depends on the capacity constraint $d$. As a result, the computational complexity of the greedy algorithm is of order $O(N \cdot d)$, well below the computational complexity of the brute-force search. Algorithm 1 presents the greedy maximization algorithm applied to maximization of the empirical welfare (20).
Algorithm 1
1 : Input: [...] capacity constraint $d$;
2 : Initialization: start from the empty set $V = \emptyset$;
while $|V| < d$ do
    3 : for each $i \in \mathcal{N} \setminus V$ do
    4 :     compute the marginal gain $F_n(V + \{i\}) - F_n(V)$;
        end
    5 : select the $i$ which maximizes the marginal gain and add it to the set $V$;
end
return the set $V$

then it means $D$ must satisfy the targeting constraint in $\mathcal{I}$. Next, for any $D, E \in \mathcal{I}$,

Algorithm 2
1 : Input: [...] capacity constraint $d$, and targeting constraints $d_1$, $d_2$;
2 : Initialization: start from the empty set $V = \emptyset$;
while $|V| < d$ do
    3 : for each $i \in \mathcal{N} \setminus V$ do
    4 :     compute the marginal gain $F_n(V + \{i\}) - F_n(V)$;
        end
    5 : sort the candidates $i$ in order of decreasing marginal gain, $i_{(1)}, i_{(2)}, \ldots$;
    6 : if $\sum_{j \in V} a_j + a_{i_{(1)}} \le d_1$ and $\sum_{j \in V} b_j + b_{i_{(1)}} \le d_2$ then
    7 :     add the first element $i_{(1)}$ to $V$;
        else
    8 :     repeat step 6 with the remaining candidates;
        end
end
return the set $V$

The performance guarantee of Algorithm 2 is given in Proposition 4.1.

Figure 2: Comparison between Greedy Algorithm and Targeting Without Network Information

since $F_n(\emptyset) = 0$. Now, we have shown that $F_n(V)$ is a cut function. The next step is to find the necessary and sufficient conditions for submodularity of the cut function. Lemma B.1 indicates that, for any cut function which can be written as a quadratic function plus a linear part, submodularity holds if and only if all off-diagonal elements of the weighting matrix are non-positive. That requires $\hat{w}_{ij} \le 0$ for all $i \ne j$.

We first prove the upper bound of $E_{P^n} |\hat{w}_{ij} - w_{ij}|$:
$\hat{w}_{ij} - w_{ij} = \frac{S_i g_i A_{ij} I_j}{|N_i| N} \left[ (\beta_{11} - \hat{\beta}_{11}) a_i a_j + (\beta_{12} - \hat{\beta}_{12}) a_i b_j + (\beta_{21} - \hat{\beta}_{21}) b_i a_j + (\beta_{22} - \hat{\beta}_{22}) b_i b_j \right]$.
Assumption 5.1. Let $\hat{\beta}_{sk}$ denote the estimate of the effective contact rate between group $s$ and group $k$, and $\hat{\gamma}_s$ denote the estimate of the recovery rate in group $s$. The following properties need

Table 1: Summary of the SIR parameter values. Column headers: Parameters, set 1, set 2.

Table 4: The value of welfare (the sum of probabilities of being healthy in the second period) averaged over 100 random networks (standard errors in parentheses). We use the Greedy Algorithm or the Targeting Without Network Information allocation to determine who in each network should be vaccinated.

Usually, in the literature on treatment assignment, researchers use observational data or

Table 5: The percentage of vaccinated younger units in the second period under the vaccine allocation policies obtained by the Greedy Algorithm, averaged over 100 random networks. We choose three different sets of weights in this comparison.