Dynamic fair balancing of COVID-19 patients over hospitals based on forecasts of bed occupancy

This paper introduces mathematical models that support dynamic fair balancing of COVID-19 patients over hospitals in a region and across regions. Patient flow is captured in an infinite server queueing network. The dynamic fair balancing model within a region is a load balancing model incorporating a forecast of the bed occupancy, while across regions, it is a stochastic program taking into account scenarios of the future bed surpluses or shortages. Our dynamic fair balancing models yield decision rules for patient allocation to hospitals within the region and reallocation across regions based on safety levels and forecast bed surplus or bed shortage for each hospital or region. Input for the model is an accurate real-time forecast of the number of COVID-19 patients hospitalised in the ward and the Intensive Care Unit (ICU) of the hospitals based on the predicted inflow of patients, their Length of Stay and patient transfer probabilities among ward and ICU. The required data is obtained from the hospitals’ data warehouses and regional infection data as recorded in the Netherlands. The algorithm is evaluated in Dutch regions for allocation of COVID-19 patients to hospitals within the region and reallocation across regions using data from the second COVID-19 peak.


Introduction
Confronted with a pandemic or the outbreak of a severe and highly contagious disease on a national level, governments and organisations must implement appropriate countermeasures [1] . Accurate estimation of disease prevalence is essential for monitoring and decision making [2] , as is optimal distribution of vaccines [3,4] . Despite these effort s, hospit als may be overwhelmed by infected patients. These patients arrive in addition to hospitals' regular patients, increasing the strain on hospital staff and resources [5,6] . Alternative resources such as backup and field hospitals or student nurses may offer additional capacity [7] . Despite such measures, hospitals may have no other option than to temporarily decrease the number of regular patients treated [8,9] . A reduction in regular care has serious consequences, in particular for oncology patients and others whose condition may worsen irreversibly if treatment is postponed [10] , but also in other medical specialties healthy life years are lost due to fewer treatments [9] . Hospitals may defer infected patients to other hospitals facing a less severe surge in infected patients, either inside or outside their region, to avoid exceeding their maximum capacity [11] . Next to being unavoidable in certain cases, redistributing patients may also balance the staff work pressure across regions and may avoid potentially unethical differences in the accessibility of regular care across regions. Redistributing infected patients according to a fair balancing policy provides the opportunity to share and thus reduce the burden of the pandemic on regular patients as well as hospital staff.
When the COVID-19 pandemic reached the Netherlands, the Dutch government erected a national coordination centre for patient reallocation (in Dutch: 'Landelijk Coördinatie-centrum Patiëntenspreiding', LCPS) with exactly these aims [12] . To fulfil its mission, LCPS cooperates with the twelve 'ROAZ' regions of the country, where each ROAZ (in Dutch: 'Regionaal Overleg Acute Zorgketen') region has its own consultative body for the acute care chain [13] . When a hospital requests to reallocate one of its COVID-19 patients, other hospitals within the same region are considered first, as an intra-regional reallocation is the least burdensome for the patient, his or her relatives, and the ambulance transportation service. If an intra-regional reallocation is not possible, the patient is reallocated to another region in the Netherlands, or to Germany as a last resort. As outlined in Bekker et al. [14] , reallocation of patients is based on separate forecasts of the COVID-19 bed occupancy at the ward and Intensive Care Unit at the regional level, but neither a detailed model to fairly balance patients across regions taking into account these predictions, nor a model to fairly allocate patients to hospitals within a region taking into account bed occupancy predictions per hospital are included.
To facilitate the reallocation process, hospitals are required to report their bed surplus that is available for COVID-19 ward and ICU patients daily at 10 AM. Physicians report the bed surplus by observing the current situation and likely incorporating some form of safety margin. Bed surplus and bed shortage are influenced by several factors. The surplus of COVID-19 ward beds, for example, increases due to discharges and transfers to the ICU, while it decreases by transfers from the ICU, possible reallocations from other hospitals and new admissions, where the latter depends on the number of COVID-19 infections in the region some days ago. It is infeasible for a physician to gauge and incorporate the combined effect of all these factors when reporting the bed surplus. Consequently, the number of patient reallocations will likely be higher than necessary. For example, a hospital may report ICU bed surplus today and thus receive a COVID-19 ICU patient from another hospital, while that results in ICU bed shortage two days from now, necessitating to reallocate a COVID-19 ward patient that needs ICU care.
This paper presents mathematical models and resulting decision rules that support fair balancing of COVID-19 patients over hospitals in a region and across regions. In these models, the flow of COVID-19 patients is captured in a network of infinite server queues. The first model, at the regional level, is a load balancing model that supports dynamic fair balancing of COVID-19 patients over hospitals in a region. The second model, at the inter-regional level, is a stochastic program that minimises the costs of patient reallocations across regions. Input for the models is the inflow of patients, their Length of Stay (LoS) in the ward and ICU and transfer of patients between these units. To this end, our method is augmented by accurate statistical methods to predict patient arrivals, estimate LoS and transfer probabilities, and forecast the number of COVID-19 patients hospitalised in the ward and ICU of a hospital. Our results are cast in real-time decision rules for patient allocation to hospitals in a region or reallocation across regions based on safety levels that determine the bed surplus or bed shortage for each hospital or region during the next couple of days.

Literature
Dynamic load balancing Our fair balancing of COVID-19 patients over hospitals within a region falls in the class of load balancing methods that are well-known in communication systems, see, e.g., Ross [15] , Zachary and Ziedins [16] , van der Boor et al. [17] for an introduction to basic load balancing scenarios, that states: "Load balancing can be broadly categorised as static, dynamic, or some intermediate blend, depending on the amount of state information that is taken into account". Below, in line with our approach, we focus on dynamic load balancing. Dynamic load balancing algorithms aim at improving the system throughput and reducing the job response time by relocating application tasks among the nodes using information on the instantaneous system load to decide how to relocate the jobs [18] . Sender-initiated strategies (congested nodes push work to lightly loaded nodes) outperform receiverinitiated strategies (lightly loaded nodes pull work from highly loaded nodes) at light to moderate system loads, whereas receiverinitiated strategies are preferable at high system loads [18,19] . Observe that dynamic load balancing may be hindered by incomplete state information [20,21] . Exact performance analysis of dynamic load balancing policies is argued to be difficult due to multidimensional state spaces, see [22,23] .
Dynamic load balancing algorithms are developed for systems with stationary arrival and service rates, including web servers [24][25][26] and large (virtual) call centres [27,28] . Typically, these methods involve large Erlang loss systems and servers with multiple skills as in call centres, involve a queue as in web servers, or introduce replicas of jobs that are sent to join the queue at different servers and upon completion of service of the first replica delete all other replicas. Typical ingredients for load balancing approaches are setting and adjusting routing probabilities [29] or routing policies [22,30,31] and job migration algorithms [32] . The Join-the-Shortest-Queue (JSQ) policy is a centralised dynamic load balancing algorithm, where a dispatcher must immediately forward tasks upon arrival to one of the servers. Implementation of JSQ policies becomes difficult when the number of stations becomes large, in which case asymptotic methods may be used [17,20,21,33] .
Our dynamic load balancing method is developed for systems under high load that do not allow for queueing, in which case a receiver-initiated strategy is preferable, which is implemented via a centralised approach to allocate patients to hospitals. We assume complete state information and use simulation to assess performance measures. Our fair balancing method across regions is a stochastic program taking into account forecasts of the future bed surplus or shortage. It is related to mixed integer recourse models [34,Chapter 3] . Typically, solving such models requires scenarios or pre-specified input processes and does not include dynamical statistical methods for real-time forecasts of occupancy and safety levels which is included in our approach. We further show that our stochastic program may be well approximated by a mixed integer program (MIP) that facilitates a direct relation with the way hospitals report bed shortages/surpluses. Forecast and queueing model Our fair balancing method requires accurate forecasts of the patient arrival rates. Several studies have developed prediction models for the number of hospitalised COVID-19 patients. Focusing solely on predicting ICU occupancy, Farcomeni et al. [35] , Goic et al. [36] , Manca et al. [37] , Massonnaud et al. [38] develop prediction models at the regional level, while [39] provides predictions for individual hospitals. Other studies predict both COVID-19 ward and ICU occupancy [14,[40][41][42][43] . The prediction in Roimi et al. [42] , Zhao et al. [43] is based on regression analysis or epidemic models. Queueing models for predicting ward and ICU occupancy in the Netherlands at the regional and national level are developed in Bekker et al. [14] ; their models do not incorporate patient transfers from ward to ICU and vice versa. Transfer probabilities between COVID-19 ward and ICU are derived based on a Markov chain analysis in Foucrier et al. [41] . In a previous paper, Baas et al. [40] , we have developed forecasts of COVID-19 ward and ICU occupancy at the individual hospital level, incorporating patient transfers between the ward and ICU. That method uses a Richards' curve [44] to predict the arrival rates of COVID-19 patients, a Kaplan-Meier estimator [45] to estimate the distribution of the LoS in both the COVID-19 ward and ICU, and we sample patient trajectories in the Poisson Arrival Location Model [46] that determines the queue occupancy in a network of infinite server queues representing the COVID-19 ward and ICU. In this paper, we build on our previous work by significantly improving the forecasts of ward and ICU occupancy and by using these forecasts as a basis for decision rules that facilitate fair balancing of COVID-19 patients over hospitals.

Contribution
Our contribution in this paper is threefold. First, we develop a load balancing method that incorporates bed occupancy fore-casts to fairly balance COVID-19 patients over hospitals in a region. These results extend load balancing results in literature to incorporate forecast occupancy and safety levels. Second, we propose a stochastic program taking into account scenarios of the future bed surpluses or shortages to optimally distribute COVID-19 patients that cannot be accommodated within a region over multiple regions taking into account travelling distances and other differences between regions; the currently available COVID-19 bed-capacity in each region; as well as scenarios of the maximum bed-occupancy over several days. Third, as our models require accurate forecasts of the COVID-19 patient arrival rate, we extend the results of Baas et al. [40] on the prediction of the arrival rate in the network of infinite server queues that we use in the bed occupancy forecasts. Originally, in Baas et al. [40] , a Richards' curve was fit to data from the hospitals' data warehouse to predict the COVID-19 patient arrival rate, while in this paper we develop a time-delaying and filtration procedure applied to the exponentially weighted moving average of regional infection data, which results in more accurate bed occupancy forecasts. Section 2 presents our hierarchical modelling and decision approach. We propose decision rules for individual hospitals in Section 2.1 , within a region in Section 2.2 , and across regions in Section 2.3 . Our load balancing procedure is based on arrival rate scaling, see Appendix A . Appendix D summarises notation used in these models. Section 3 presents our prediction of COVID-19 patient arrival rates from regional infection data and considers the accuracy of our bed occupancy forecasts. Section 4 presents numerical results and illustrates the performance of our models and decision rules: Section 4.1 presents the inter-regional patient allocation results and Section 4.2 considers reallocating patients across regions. Section 5 concludes the paper.

Model and decision rules
This section presents our hierarchical modelling and decision approach. Section 2.1 briefly reviews the essential elements of our model for forecasts of the occupancy in the ward and ICU at an individual hospital [40] , and introduces a decision rule to determine available capacity for patients from other hospitals. Section 2.2 introduces a load-balancing rule for allocation of patients to hospitals in a region and a decision rule when combining hospitals into one regional hospital exploiting the statistical multiplexing gain [15] . Section 2.3 introduces a recourse model to optimally distribute COVID-19 patients that cannot be accommodated within a region over multiple regions taking into account travelling distances, the currently available COVID-19 bed-capacity in each region, as well as the forecast maximum bed-occupancy over several days.

Bed surplus or shortage for an individual hospital
Consider a hospital with dedicated COVID-19 ward and ICU, indexed by h , that admits a fraction of COVID-19 patients from its service region as determined by the regional number of infected patients (autonomous arrivals). Following [40] , we model the hospital as a network of two infinite server queues that records the number of hospitalised COVID-19 patients. The network includes patient-characteristics c ∈ C (e.g., age, gender, weight) that may affect the hospitalisation rate and the patient journey in the hospital, a time-dependent Poisson arrival process with rate λ hc (t) , with fraction p hc (t) admitted to the ward, fraction 1 − p hc (t) to the ICU, general and time-dependent LoS L hcW (t) and L hcI (t) at ward and ICU, and discharge probabilities q hcW (t) , resp. q hcI (t) . The assumption of Poisson arrivals was shown to be justified for arrivals to Emergency Departments [47] . The number of patients N hcW (t) and N hcI (t) with characteristics c at time t have a time-dependent Poisson distribution with means ρ hcW (t) , ρ hcI (t) that are determined via the Poisson Arrival Location Model (PALM), see [46, Theorem 2.1] . The total number of patients in the ward, N hW (t) , and ICU, N hI (t) , have time-dependent Poisson distributions with means ρ hW (t) = c∈ C ρ hcW (t) , and ρ hI (t) = c∈ C ρ hcI (t) , see [40] .
The Poisson distributions for N hW (t) and N hI (t) allow us to explicitly evaluate relevant performance measures. Let L h (s ) be tuples of the location and realised LoSs (up to time s ) of all patients residing in hospital h . The expected occupancy at time s + t is: The expected maximum occupancy in [ s, s + t] is: The α hW -quantile, n hW,α hW (s, t ) , and the α hI -quantile, n hI,α hI (s, t ) , for respectively the maximum occupancy in the ward and ICU at hospital h in [ s, s + t] , determine the required capacity n hW,α hW (s, t ) , n hI,α hI (s, t ) in the ward and ICU to accommodate all autonomous arrivals in [ s, s + t] with probability at least α hW , α hI , respectively. We will refer to α hW , α hI as the safety levels .
Let n * hI (s, t ) be the number of beds in the ICU of hospital h in the time-interval [ s, s + t ] . If n hI,α hI (s, t ) < n * hI (s, t ) we may argue that at safety level α hI a number of beds ˜ n hI,α hI (s, t ) = n * hI (s, t ) − n hI,α hI (s, t ) may be considered unoccupied in [ s, s + t] . This bed surplus may then be allocated to COVID-19 patients from other hospitals. If n hI,α hI (s, t ) ≥ n * hI (s, t ) , hospital h may be confronted with bed shortage (at safety level α hI ), and will not offer any beds to ICU patients from other hospitals. A similar reasoning applies to the ward. Note that we do not assume a fixed number of beds, but include dependence on s, t in the number n * hW (s, t ) , n * hI (s, t ) of beds in [ s, s + t] . This allows the number of beds to be scaled up or down over time. We arrive at the following decision rule to determine the bed surplus. We introduce the following notation. Let Decision Rule 1 (Individual hospital) . Consider hospital h , with n * hW (s, t ) , n * hI (s, t ) beds in the ward resp. ICU in [ s, s + t] . Consider safety levels α hW , α hI . Let n hW,α hW (s, t ) , n hI,α hI (s, t ) be the α hW -, α hI -quantiles for the maximum occupancy at hospital h , (see (3) ). At safety levels α hW , α hI , hospital h has bed surplus of ˜ n hW,α hW (s, t ) = n * hW (s, t ) − n hW,α hW (s, t ) The bed surplus is determined at safety levels α hW , α hI . These levels may include the hospital's policy for bed allocation, treatment protocols, or case mix, but may also incorporate the size of the hospital.
As an extreme case, we may set α hW = 0 to indicate that we only focus on overcrowding of the ICU at hospital h .

Patient reallocation within an individual region
This section extends the decision rule for an individual hospital to a decision rule for an individual region to allocate COVID-19 patients among the hospitals in that region during a time interval [ s, s + t] . Consider a region, indexed by r, containing H r hospitals, indexed h = 1 , . . . , H r , where each hospital is modelled as described in Section 2.1 . We will first determine the autonomous arrival rate for each hospital as fraction of the regional patients in Section 2.2.1 . Then, in Section 2.2.2 , we determine for each hospital the bed surplus according to Decision Rule 1 and subsequently present a load-balancing Decision Rule 2 based on the bed surplus for the region. In Section 2.2.3 , we assume that hospitals disclose all information on the number of beds and hospitalised COVID-19 patients, which allows viewing the COVID-19 wards and ICUs in a region as single merged ward and ICU. This may be viewed to correspond to a regional coordination centre that optimally assigns patients to hospitals, resulting in a lower bound on the number of patients reallocated out of a region.

Autonomous arrival rates
Let rc (u ) denote the time-dependent rate at which COVID-19 patients with characteristics c ∈ C request to be hospitalised in region r, and let r (u ) = c∈ C cr (u ) be the total arrival rate of COVID-19 patients in region r. Consider hospital h , with n * hW (s, t ) , n * hI (s, t ) beds in the ward and ICU in [ s, s + t] . At safety levels α hW , α hI , hospital h may admit a fraction θ h,α hW ,α hI (s, t ) of these regional patients in the ward and ICU such that the ward can accommodate all autonomous arrivals with rate θ h,α hW ,α hI (s, t ) r (u ) in [ s, s + t] with probability at least α hW , and similarly for the ICU. Let P θ ,h denote the distribution of the number of patients for hospital h in [ s, s + t] given arrival rates λ hc (u ) = θ rc (u ) , u ∈ [ s, s + t] , c ∈ C. Then θ h,α hW ,α hI (s, t ) may be determined as θ h,α hW ,α hI (s, t ) = max θ : Let θ r,αr (s, t ) = H r h =1 θ h,α hW ,α hI (s, t ) , with α r = { α hW , α hI : h = 1 , . . . , H r } be the set containing all safety levels of region r.
If θ r,αr (s, t ) < 1 , region r has insufficient capacity to accommodate all arrivals during [ s, s + t] at safety levels α r . Hospital h admits the fraction θ h,α hW ,α hI (s, t ) of autonomous regional patients arriving in [ s, s + t] corresponding to its safety levels α hW , α hI .
Thus, the autonomous arrival rate of patients with characteristics c for hospital h is At safety levels α r , the remaining fraction of patients arriving at rate [ 1 − θ r,αr (s, t ) ] r (u ) must be accommodated in hospitals outside region r.
If θ r,αr (s, t ) ≥ 1 , region r has sufficient capacity to accommodate all autonomous arrivals. A fair or load-balancing distribution of COVID-19 patients over the hospitals in the region according to their safety levels α r is obtained by admitting the fraction θ h,α hW ,α hI (s, t ) /θ r,αr (s, t ) of patients in hospital h . Thus, the autonomous arrival rate of patients with characteristics c for hospital h is In Section 2.2.2 , we use (6) and (7) to determine the region's ward and ICU bed shortage or bed surplus.
Remark 2 (Fraction of admitted regional patients) . Observe that θ h,α hW ,α hI (s, t ) in (5) determines the fraction of patients admitted in hospital h irrespective of admittance to ward or ICU. This is a natural choice, as in most cases hospitals first admit regional patients and then perform triage (i.e., determine whether the patient is admitted to the ward or ICU). Our model includes transfers between ward and ICU that may occur in [ s, s + t ] , so that θ h,α hW ,α hI (s, t ) yields the fraction of admitted patients at the safety levels of the ward and ICU .
Observe that the arrival rates λ hc (u ) in (6) and (7) imply that the probability that hospital h cannot accommodate its autonomous arrivals in [ s, s + t] at its ward, resp. ICU, is at most (7) , these probabilities will most likely be even smaller, as the arrival rates λ hc (u ) are smaller than the arrival rates θ h,α hW ,α hI (s, t ) rc (u ) that hospital h can accommodate at safety levels α hW , α hI .
If θ r,αr (s, t ) < 1 , it may occur that hospital h could admit patients to its ward at higher rates than under λ hc (u ) (from (6) ), but not to its ICU, or vice versa. This follows from (5) , as, at some point, either the ward or the ICU is the bottleneck to further in- (5)  We may extend our model to include different θ hW,α hW (s, t ) and θ hI,α hI (s, t ) for a hospital ' s ward and ICU, respectively, defined as

one of the two inequalities in
This may allow more flexibility in accepting COVID-19 patients in the ward when the ICU has reached its capacity, and vice versa, but may also result in overcrowding of either the ward or the ICU .

Load balancing rule to fairly allocate patients to hospitals
Section 2.2.1 has established whether or not the hospitals in a region r may admit all autonomous arrivals at safety levels α r . If θ r,αr (s, t ) < 1 , the autonomous arrival rates (6) determine the region's bed shortage in [ s, s + t ] at safety levels α r . If θ r,αr (s, t ) ≥ 1 , the rates (7) determine the hospitals' bed surplus in [ s, s + t] and hence the bed surplus of region r. This section presents a load balancing rule for fair allocation of patients to the hospitals in a region.
First, if θ r,αr (s, t ) ≥ 1 , we invoke Decision Rule 1 with patient arrival rates (7) to determine the bed surplus for each hospital h in region r and add these numbers to obtain the region's bed surplus in ward, resp. ICU, in [ s, s + t] at safety levels α r as Second, consider the case θ r,αr (s, t ) < 1 . At safety levels α r , the remaining fraction of patients arriving at rate [ 1 − θ r,αr (s, t ) ] r (u ) must be accommodated in hospitals outside region r for each u ∈ [ s, s + t] . Our model requires discrimination between the number of patients admitted at the ward and the ICU. To this end, observe that θ h,α hW ,α hI (s, t ) /θ r,αr (s, t ) is the fraction of patients that would be admitted to hospital h if all hospitals would have ample capacity. For each u ∈ [ s, s + t] the fraction of patients with characteristics c admitted to all wards of hospitals in region r at safety levels α r may be obtained as If the fractions p hc (u ) for all hospitals in the region coincide, say p hc (u ) = p c (u ) , then (9) reduces to p rc (u ) = p c (u ) . Let M rW,αr (s, t ) , resp. M rI,αr (s, t ) , be the regional bed shortage in ward, resp. ICU, in [ s, s + t] at safety levels α r . Then, M rW,αr (s, t ) , resp. M rI,αr (s, t ) , are Poisson distributed random variables with means m rW,αr (s, t ) , resp. m rI,αr (s, t ) : (11) so that the expected regional bed shortage in ward, resp. ICU, in Combining the results above with those of Section 2.2.1 we obtain the following load balancing decision rule to allocate patients to hospitals in a region and determining the regional bed shortage or surplus.   [17] , and see [16] for a general reference on load balancing for loss networks .

Merging all wards and all ICUs in a region
This section introduces regional control of COVID-19 beds. Merging the ICU bed-capacity of individual hospitals into a regional ICU may considerably reduce the number of patients reallocated out of the region, see [48] . We will exploit the so-called statistical multiplexing gain [15] , and merge all wards, resp. ICUs, into a single (virtual) regional ward, resp. ICU.
Assume that the hospitals h = 1 , . . . , H r in region r agree on regional safety levels α rW and α rI for their wards and ICUs and (virtually) merge their COVID-19 wards, resp. ICUs, into a single regional ward and ICU, with capacities We may now view the region as a single hospital (as described in Section 2.1 ) with autonomous arrival rates rc (u ) , c ∈ C, where a fraction P rc (u ) , resp. 1 − P rc (u ) is admitted to the regional ward, resp. ICU. Let N rW (u ) , N rI (u ) record the number of patients present in the (virtual) regional COVID-19 ward and ICU at time u , respectively. The α rW -quantile, n rW,α rW (s, t ) , and the α rIquantile, n rI,α rI (s, t ) , for the maximum occupancy in [ s, s + t] follow by analogy with the single hospital model of Section 2.1 . For region r, (5) determines the fraction of autonomous regional arrivals that may be accommodated by hospital h , h = 1 , . . . , H r , at its safety levels. Cooperation among the hospitals allows a more refined rule to distribute patients over the hospitals of a region. To this end, first observe that if region r accepts autonomous arrivals to its ward, resp. ICU, at safety levels α rW , resp. α rI , then the hospitals in the region must have sufficient beds to accept these patients in their wards and ICUs. We may, therefore, at safety levels α rW , α rI , distribute patients according to the fractions θ hW,α rW (s, t ) , θ hI,α rI (s, t ) defined in (8) , while still avoiding overcrowding of the wards and ICUs, recall Remark 2 . Region r allocates the fractions of the autonomous arrivals that are hospitalised in region r to the ward, resp. ICU, of hospital for small > 0 . Patients are then allocated uniformly at random to hospitals where θ hW,α rW > 0 , and similarly for the ICU.
Combining the results above we obtain the following decision rule for allocation of patients to hospitals in a region and determining the regional bed shortage or bed surplus.
If n rW,α rW (s, t ) > n * rW (s, t ) , report a bed shortage in the wards of region r in [ s, s + t] . In both cases, allocate a fraction θ hW,α rW (s, t ) of the regional autonomous arrivals that are hospitalised in the wards in region r to hospital h . For the ICU, the rules above apply with W replaced by I. Decision Rule 2 uses the safety levels of the individual hospitals to determine the bed surplus for each hospital, whereas Decision Rule 3 uses the safety levels of the merged hospitals to determine the regional bed surplus. A merged hospital requires fewer beds to accommodate patients at the same safety level, see [15,49] and therefore the bed surplus under Decision Rule 3 exceeds the bed surplus under Decision Rule 2 .

Patient reallocation across multiple regions
This section builds on the bed surplus and bed shortage for individual regions as determined under Decision Rule 2 or 3 to obtain a decision rule for (part of) a country consisting of R regions with n * rW (s, t ) , resp. n * rI (s, t ) , beds in the ward, resp. ICU, in [ s, This decision rule determines the number of patients to be reallocated across the regions at each decision epoch s , taking into account the current number of hospitalised patients as well as the maximum number of patients hospitalised in the regions in [ s, s + t] . Patients that cannot be accommodated in the regions may be reallocated to an external region. Let ˜ n rW (s ) , resp. ˜ n rI (s ) , be the current (at epoch s ) bed shortage ( ˜ n < 0 ) or bed surplus ( ˜ n ≥ 0 ) in the ward, resp. ICU, of region r. Let the random variables ˜ Reallocation of a patient from region r to region r incurs cost γ r,r incorporating, for example, travel distance for reallocation of the patient or travel distance for his or her relatives, and differences between regions r and r with respect to the number or size of hospitals.
We may impose the (strict) triangle inequality on the costs γ r,r : to avoid that region r 2 functions as an intermediate stop for reallocation of patients from region r 1 to region r 3 , which may happen, for example, if γ r 1 ,r 2 = γ r 2 ,r 3 = 2 and γ r 1 ,r 3 = 5 , which is excluded by (13) .
We will now develop a recourse model with objective to minimise the costs of patient reallocations across regions at the current decision epoch s as well as during [ s, s + t] such that (i) patients are distributed over regions such that the current bed shortages are resolved, taking into account the bed shortages in [ s, s + t] , and (ii) the relative remaining bed surplus (detailed below) is balanced over the regions.
The here-and-now decision variables f W,r,r (s ) , resp. f I,r,r (s ) , are the number of ward, resp. ICU, patients to reallocate from region r to region r at decision epoch s . The wait-and-see decision variables F W,r,r (s, t ) , resp. F I,r,r (s, t ) , are the number of potentially required additional reallocations among the wards, resp. ICUs, of regions r and r in [ s, s + t] based on the bed surplus or shortage ˜ for all regions r. To penalise the imbalance in bed surplus in the wards and ICUs across the regions, we introduce a penalty function g(·) (e.g. g(x ) = x 2 or g(x ) = x ). The optimal number of patients f W,r,r (s ) , resp. f I,r,r (s ) , reallocated from the ward, resp. ICU, of region r to the ward, resp. ICU, of region r , r, r = 1 , . . . , R + 1 , is determined as the argmin of the following recourse model: (Here-and-now constraints, resolve current shortages in original ward and ICU) (Here-and-now constraints, don t cause shortages in destination ward and ICU) (Here-and-now constraints, transport to external region only if strictly necessary) (Wait-and-see constraints, anticipate on future shortages in ward and ICU) where we have used the additional notation for the current and future relative remaining bed surplus δ W, .
Note that the wait-and-see and domain constraints hold point-wise on the sample space, i.e., for every possible outcome of ˜ The stochastic program does not directly incorporate Decision Rule 2 or 3 that are amenable for practical implementation. To incorporate these rules in our optimisation approach, we approximate (P) by the integer program described below that is based on the safety levels α rW , α rI , and the expected shortages or surpluses reported by the regions.
The forecast shortages are resolved by the wait-and-see variables f W,r,r (s, t ) , f I,r,r (s, t ) according to wait-and-see constraints: (Wait-and-see constraints, anticipate on future shortages in ward and ICU) The wait-and-see outcomes then induce future relative remaining bed surpluses The objective is now reformulated as min r,r γ r,r ( f W,r,r (s ) + f I,r,r (s ) + f W,r,r (s, t ) + f I,r,r (s, t ) ) + R r=1 g(δ W,r (s )) + g(δ I,r (s )) + g(δ W,r (s, t )) + g(δ I,r (s, t )) (P )

Predicting arrival rates and forecasting bed occupancy
The effectiveness of the decision rules developed in Section 2 relies on an accurate real-time forecast of the COVID-19 bed occupancy, and therefore on an accurate prediction of the arrival rate and estimation of the LoS. This section considers prediction of arrival rates for a region and a single hospital, as well as generation of bed occupancy forecasts.
For each ROAZ region, the number of positive COVID-19 tests is available on a daily basis on the website of the Dutch National Institute for Public Health and the Environment (RIVM) [50] . In this data set, the number of infections on a given day represents the number of people that (retrospectively) tested positive for COVID-19 on that day. In addition to infection data, national hospital admission data per region, collected by the Dutch foundation for National Intensive Care Evaluation (NICE) is available on the website of the RIVM [51] . The number of admissions per day represents the number of patients who have tested positive for COVID-19 and are admitted to a hospital in the respective region. For our arrival rate prediction, we focus on data of the ROAZ region Netwerk Acute Zorg (NAZ) West , containing the hospitals Groene Hart Ziekenhuis (GHZ), HagaZiekenhuis (Haga) and Leiden University Medical Center (LUMC). To evaluate the accuracy of our prediction, in this section we focus on Haga, a 600-bed hospital in The Hague that admits approximately 29,0 0 0 inpatients per year. For this hospital, relevant data is available to us from September 4, 2020 until January 31, 2021.

Prediction of the arrival rates
In this section, we develop a new arrival rate predictor based on regional infection data, and compare its performance to the predictor in Baas et al. [40] that was shown to result in accurate bed occupancy forecasts. The new predictor makes explicit use of the delay between COVID-19 infection and hospitalisation, and enables us to predict the arrival rate at the (virtual) merged regional ward and ICU.
An increase in the number of COVID-19 infections results some days later in an increasing number of hospital admissions. As a consequence, we may expect a filtration and time-delay between the number of infections and the number of hospitalised patients. We assume the number of infections to be a Poisson process with time-dependent rate, so that the resulting autonomous regional arrival rate of COVID-19 patients is again a Poisson process, see, e.g., Chiu et al. [52] . We estimate the time-delay and filtration between COVID-19 infections and hospitalisations directly on the data for NAZ West in the RIVM data set [50] . The time-delay that gives the best fit is defined as the value between two and fourteen days resulting in the minimal mean squared error (MSE) when performing ordinary least squares (OLS) between a weekly moving average of the regional infections and a delayed exponentially weighted moving average (coefficient 0.1) of the regional hospitalisations [51] . The filtration that gives the best fit is the coefficient resulting from the OLS procedure corresponding to the optimal time-delay.
The result of this time-delaying and filtration procedure applied to NAZ West data is shown in the top graphs in Fig. 1 . The black dots in the top left, resp. top right, graph of represent the realised number of infections, resp. hospitalisations, per day in NAZ West. The red line in the top left graph shows the 7-day moving average (MA) of the number of infections and the red line in the top right graph shows an exponentially weighted moving average (EWMA; coefficient 0.1) of the number of hospitalisations, both to indicate the trend. These trend lines already reveal the time-delay and filtration between the number of infections and hospitalisations. The purple line in the top right graph is the result of the time-delaying and filtration procedure applied to the (averaged) daily number of infections displayed as the red line in the top left graph. The best-fit time-delay equals 7 days, which is in accordance with a recent study performed in Belgium [53] , where an average time-delay of 5.74 days is estimated, with medians ranging from 3 to 10.4 days, depending on patient characteristics. The best-fit filtration factor is found to be 3 . 1% . The extremes of the purple line in the top right graph of Fig. 1 coincide with the extremes of the number of hospitalisations (red line), but additional fine-tuning is required since the extremes over-and undershoot those of the number of hospitalisations.
To this end, we develop an t-days ahead prediction of the number of hospitalisations displayed (for t = 3 ) as the purple curve in the bottom left graph in Fig. 1 . The static predictor (purple line in the top right graph) is corrected by estimating the scaling factor of the infections using weighted least squares between delayed infection and arrival data up to time s . The weights used for this least squares procedure are normalised exponential weights with base 1.2 so that errors in the fit for recent hospitalisations are penalised more than those for earlier hospitalisations. The effect of the weighted least squares procedure is that for each time s the t-days ahead prediction starts around the trend in the number of hospitalisations at time s . Hence, in our t-days ahead prediction the daily number of infections up to time s have a larger influence on the slope of the fine-tuned purple curve in the bottom left graph than on the starting point for the prediction determined by the number of hospitalisations (orange). This is motivated by the observation that the regional fraction of hospitalised patients Top left: Regional infections (black) and 7-day moving average (MA) of the realisations (red). Top right: Regional hospitalisations (black), exponentially weighted moving average (EWMA; coefficient 0.1) of realised regional hospitalisations (red) and filtered/scaled 7-day moving average of infection data (purple). Bottom left: Regional hospitalisations (black), EWMA (coefficient 0.1) of realised regional hospitalisations (red), 3-days ahead expanding window predictions of the arrival rate by the Richards' curve model (orange) and 3-days ahead expanding window predictions of the arrival rate from regional infections (purple). The purple line is made thinner to distinguish it from the other two (this is also the case in Fig. 3 ). Bottom right: Autonomous arrivals to hospital Haga (black), EWMA (coefficient 0.1) of realised autonomous arrivals (red), 3-days ahead expanding window predictions of the arrival rate by the Richards' curve model (orange) and 3-days ahead expanding window predictions of the arrival rate from regional infections (purple). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.) is determined to a large extent by unknown factors (e.g. current regional shortage/surplus), while daily infections remain a good indicator for whether hospitalisations will go up or down. In addition, we have implemented an improved version of our t-days ahead prediction using a 5-parameter Richards' curve [40] . The improvements include the possibility of estimating a mixture of multiple Richards' curves, and to return a logistic, exponential or linear fit through (early-stage) data if this results in a smaller mean squared error (similar to the procedure described in Lee et al. [54] ). The Richards' curve predictions were quite sensitive to outliers occurring around the time of prediction. To account for this, the Richards' curve was estimated on the arrival data up to point s , augmented with 7 days of arrival data set equal to the weighted average of the number of arrivals at time s .
The bottom row of Fig. 1 displays 3-days ahead expanding window predictions [40] of the arrival rate for NAZ West (left) and Haga hospital (right) in the period September 4, 2020 until January 31, 2021. The bottom left graph also includes the hospitalisations (dots) and the trend (red) from the top right graph. The orange line corresponds to the expanding window prediction of the arrival rates by the Richards' curve estimator. The purple line corresponds to the predictions generated from the daily number of infections. Observe that the predictions resulting from the infection data are less extreme when compared to those made by the Richards' curve predictor. This sensitivity of Richards curve forecasts to the prediction date (especially before the inflection point) was also seen in Wu et al. [55] . Furthermore, observe that the daily infections predictor shows an earlier increase around mid December, when the number of hospitalisations starts to increase again. We conclude that the prediction based on daily infections outperforms the prediction based on the Richards' curve using patient arrivals in the Hospital Information System.

Forecast of bed occupancy
This section investigates the forecasting power of bed occupancy of our method and compares it to the improved version of the Richards' curve predictor of [40] . To evaluate the maximum possible gain in forecasting power, we compare our results with an oracle predictor of the arrivals, that uses the actual realised patient arrivals to forecast bed occupancy. The forecasts of bed occupancy used in this section are obtained by the sampling method presented in Baas et al. [40] for patient trajectories in the Poisson Arrival Location Model (PALM). We consider 3-days ahead expanding window forecasts for the daily and maximum occupancy for the Haga hospital in the period September 4, 2020 until January 31, 2021 as displayed in Fig. 2 , and include Table 1 containing coverage rates, bias and MAE for the forecast occupancy.
The top row of Fig. 2 contains daily occupancy forecasts made by the oracle forecaster (cyan) and the daily infections forecaster (purple) for the ward (left) and ICU (right). The forecasts for the ICU lie closer to each other (Pearson correlation coefficient 0.96) than those for the ward (Pearson correlation coefficient 0.91). This might be due to the smaller number of direct autonomous arrivals to the ICU ( 28% ) than to the ward ( 90% ). Hence, the difference in predicted arrival rates is expected to have the largest influence on occupancy forecasts for the ward. Note that the pattern seen in the oracle forecast for the ward is also seen in the daily infections predictions with a delay of three days, which makes sense as we are considering a 3-days ahead expanding windows forecast.
The middle row of Fig. 2 shows forecasts for the Richards' curve (orange) and daily infections (purple) along with the realised occupancy (red) in the ward (left) and ICU (right). The reported occupancy excludes patients reallocated from/to other hospitals. The two forecasts lie very close to each other (Pearson correlation coefficient 0.97 for the ward and 1.00 for the ICU), and close to the oracle. In accordance with the predictions for the arrival rates, the Richards' curve forecasts have a more fluctuating behaviour in comparison to the daily infections forecasts. The forecasts are close to the realised daily occupancy with a delay of 3 days and have a higher forecasting power when the occupancy decreases. A detailed comparison is included in Table 1 . The difference between occupancy forecasts is again largest for the ward, which can be explained by the larger fraction of direct arrivals at the ward. Observe that the fluctuations in the arrival rate predictions in Fig. 1 are dampened in the forecast in Fig. 2 , which may be partially explained as the load is obtained as integral over the arrival rates [46,Theorem 1.2] .
The bottom row of Fig. 2 shows forecasts of the maximum occupancy over 3 days and their realisations in the ward (left) and ICU (right). The forecasts for maximum occupancy lie very close to each other for both ward (Pearson correlation coefficient 0.99) and ICU (Pearson correlation coefficient 1.00). Clearly, fluctuations in the maximum occupancy over 3 days are lower than those for daily occupancy predictions. Table 1 presents a detailed overview and comparison of the quality of the forecasting results for our Richards' curve, daily infections and oracle forecasting methods for COVID-19 bed occupancy. We show results for coverage rate (CR), bias and mean absolute error (MAE), of which MAE is the most important measure as it captures the average absolute difference (distance) between forecast and realisation. For the ICU, we observe that the Richards' curve and regional infections forecast show similar results for all measures, and are outperformed by the oracle forecaster as is to be expected since the oracle forecaster uses the exact values for the number of patients in the hospital. For the ward, with respect to MAE an ordering can be found in the quality of the forecast: the oracle forecast outperforms the daily infections forecast, which in turn outperforms the Richards' curve forecast. This is not seen when looking at the 1-day ahead MAE for the Richards' curve and daily infections forecasts, as the Richards' curve is explicitly designed to extrapolate the current trend. The regional infections forecast also outperforms the Richards' curve forecast for CR and bias when forecasting the maximum 3 −, 5 − and 7-days ahead. Note that the CR increases and bias decreases in the horizon for the ward for all forecasts. Apparently, the maximum occupancy forecast tends to overestimate the realisation for shorter horizons, while this bias lessens for longer horizons. Observe that the oracle forecast shows a larger bias in the expected maximum occupancy forecasts. This is because maximum occupancy forecasts are always higher than or equal to the current occupancy. Hence forecasts for the maximum often point in the right direction (less negative bias) when occupancy increases and in the wrong direction (more positive bias) when occupancy decreases. This effect occurs to a larger extent for the oracle, as it can better forecast increases in occupancy. As we are mainly interested in forecasts of the maximum occupancy, we conclude that the daily infections forecast is to be preferred over the Richards' curve forecast and will be used in subsequent sections.

Remark 7 (Possibility of Overfitting) . A question that might arise is whether there might be an overfitting issue under the proposed method.
While this is indeed a possibility when fitting the arrival predictor, this cannot be the case for the occupancy forecasts as we do not directly use occupancy data in the forecasts. Furthermore, all our evaluations are expanding window out-of-sample forecast evaluations, meaning that we never train and evaluate the model on the same data. Looking at the outcomes for the Haga data, the Richards ' curve and regional infections forecasts for the realised arrivals are likely not an overfit on that data set as the forecasts follow the trend and clearly have less variance than the actual realisations, which would be the case with an extreme overfit .

Numerical results
This section presents numerical results illustrating the performance of our hierarchical model that fairly balances COVID-19 patients over hospitals in a region and across regions. Section 4.1 considers allocation of patients to hospitals within a region, and Section 4.2 considers optimal reallocation of patients across regions.

Allocating regional COVID-19 patients to hospitals in a region
This section compares the impact of the three levels of regional coordination described in Section 2.2 : individual hospitals ( Section 2.2.1 ), load balancing ( Section 2.2.2 ), and merging all hospitals into a regional hospital ( Section 2.2.3 ) on the allocation of patients to hospitals in a region via a simulation study. Table 2 Capacity and initial occupancy for each hospital used for the simulation study of Section 4.1 . The capacity chosen here is not directly based on the actual capacity at the hospitals and merely chosen as such for the simulation study.

Capacity
Init. occupancy Hospital  Ward  ICU  Ward  ICU   LUMC  29  11  22  7  Haga  41  17  30  13  GHZ  31  13  25  8 We consider occupancy by COVID-19 patients in the period 8 November 2020 until 7 January 2021 (60 days) at three hospitals in ROAZ region NAZ West: Groene Hart Ziekenhuis (GHZ), HagaZiekenhuis (Haga) and Leiden University Medical Center (LUMC). Each hospital is modelled as described in Section 2.1 . The probability of class assignment and class-dependent LoS distributions are estimated using the estimation procedure from Baas et al. [40] from pooled data collected from the hospital data warehouses of the three hospitals. For each patient class, the same fixed allocation probability and LoS distribution is assumed for each hospital as well as for the merged regional hospital. Table 2 presents the number of beds and initial occupancy for each hospital that are used in our simulation study. The number of beds chosen here is of the order of magnitude of the actual capacity at the hospitals, but is chosen fixed to reveal the differences in the allocation methods. In practice, the number of beds may have fluctuated over the days.
The arrival process of patients to the ward and ICU is found to be a non-homogeneous Poisson process with arrival rates displayed in Fig. 3 . These arrival rates are obtained using an exponentially weighted moving average (coefficient 0.3 for the ward and 0.1 for the ICU), and scaling (0.95) of the daily hospitalisations, obtained as described in Section 3.1 . In the considered period, on average around 12 ward patients arrive per day with a fluctuating rate, and one ICU patient arrives every three days, with fewer patients arriving towards the end of the considered period.
Using the arrival processes, LoS distributions and transition probabilities, the PALM of the system of infinite server queues corresponding to each hospital can now be simulated on a day-to-day basis according to three levels of regional coordination, using the Table 3 Average number of over-bed days and over-beds (with 95% confidence interval) for safety level 0.9 for each hospital.  [40] . The time of evaluation for each day is set to 10 AM, in accordance with the time that Dutch hospitals report their occupancy. The three levels of regional coordination are as follows: • Individual Hospitals: Patients are randomly allocated to hospitals according to fixed probabilities equal to fraction of COVID-19 patients allocated to that hospital over the evaluation period: 0.30 for LUMC, 0.43 for Haga and 0.27 for GHZ. • Load Balancing: Patients are allocated to hospitals according to the load balancing Decision Rule 2 . The estimation procedure for θ h,α hW ,α hI (s, t ) is given in Appendix A and is based on the daily infections predictor, see Section 3 . The safety levels α hW , α hI were set to 0.9 for each hospital. When θ h,α hW ,α hI (s, t ) = 0 for each hospital the allocation probabilities are set equal to those under the rule "Individual Hospitals". • Regional Hospital: All wards and ICUs are merged into a regional ward and ICU as described in Section 2.2.3 . Patients are distributed over the hospitals according to Decision Rule 3 .
In the simulation study, all patients are distributed over beds at hospitals in the region, and hence cannot be reallocated outside the region. If the capacity of a hospital is exceeded, patients stay at a so-called over-bed until the hospital's bed shortage is resolved. An over-bed is an originally unequipped, non-staffed bed which is forcefully brought into operation.
For the three coordination levels, Table 3 presents average Key Performance Indicators (KPIs) with 95% confidence intervals based on Student's t-distribution. The averages and confidence intervals are calculated based on 250 independent simulation replications under each policy, generated as described above. For each hospital, the number of over-bed days (days with a bed shortage) was determined for the ward and ICU. This KPI does not reveal the number of over-beds. To this end, the average daily number of over-beds (averaged number of occupied over-beds per day averaged over the evaluation period) was also determined for both departments at each hospital. Under the rules "Individual Hospitals" and "Load Balancing" the total number of over-bed days and over-beds for the region was determined by summing the KPIs for the individual hospitals. The last column of Table 3 includes only the number of over-bed days and the number of over-beds for the region as under Decision Rule 3 patients are allocated to an over-bed only if none of the hospitals in the region has a bed surplus.
We observe a clear ordering in the performance of the allocation rules. At all hospitals, Load Balancing yields a significant reduction of the number of over-bed days of around 50% for the ward and around 60-75% for the ICU when compared to Indi-vidual Hospitals. This also holds for the average number of overbeds with an approximate 60-70% reduction for the ward and 80-90% reduction for the ICU. Regional Hospital further significantly improves performance, reducing the number of over-bed days to around 2 for the regional ward and ICU, with on average only 0.22 (ward) and 0.12 (ICU) over-beds needed per day. Appendix B contains a plot displaying the evolution of the occupancy under the regimes Load Balancing and Individual Hospitals, as well as a plot of the evolution of the allocation probabilities under Decision Rule 2 . These results show the clear advantage of regional collaboration.

Reallocating COVID-19 patients across regions
This section considers four policies for reallocating patients across regions to alleviate bed shortages. We consider occupancy in the period 8 November 2020 until 7 January 2021 (60 days) for four ROAZ regions: Acute Zorgregio Oost (region 1), Netwerk Acute Zorg Brabant (region 2), Netwerk Acute Zorg West (region 3) and Traumazorgnetwerk Midden-Nederland (region 4), augmented with an external region. The LoS and transition probabilities are estimated from the data warehouses of a representative large hospital in each region. To focus on the reallocation across regions, each region is modelled as a (virtual) regional ward and ICU, as described in Section 2.2.3 .
Patients arrive autonomously to either the ward or ICU in each region according to the arrival rates displayed in Fig. 4 . These arrival rates are predicted using the procedure outlined in Section 3.1 using an exponentially weighted average (coefficient 0.3) of the hospital admissions reported in the data set with daily infections for each ROAZ region [50] . This regional arrival rate is then multiplied with the historical fraction (from the hospital data warehouses) of patients allocated to the ward and ICU to obtain the autonomous arrival rates to the wards and ICUs. To take into account that COVID-19 beds are occupied by both COVID-19 confirmed and non-confirmed COVID-19 patients, the hospitals' arrival rates are scaled to obtain autonomous arrival rates close to those reported in (for instance) the dashboard [56] for the evaluation period. The capacity and initial occupancy, as well as the scaling factor used for the arrival rate for both departments at each of the regions is given in Table 4 .
If the occupancy in a region exceeds capacity (measured at 10 AM, the evaluation time), a patient has to be reallocated to another region. We consider the following policies: • Any Surplus: A region has sufficient surplus at a department (ward or ICU) when the surplus of beds exceeds a safety threshold k as , which in our experiments is set to k as ∈ { 0 , 1 , 2 } .  This minimal surplus is included to guarantee bed-capacity for autonomous arrivals in the region during the day. If a region r has bed shortage in ward and/or ICU at the evaluation time and other regions have sufficient bed surplus at the respective departments, a patient from region r is reallocated with equal probability to any of the regions having sufficient surplus at the respective departments. Decisions for reallocation of patients are taken one-by-one, such that for the next patient reallocated the occupancy of the regions takes into account the previous reallocation decisions. If after patient reallocation to a region the surplus of the department in that region is no longer sufficient (i.e., less than k as ), the department at that region is no longer considered for reallocation. If there is no region with a sufficient surplus for the respective department, the patient is reallocated to the external region. • Number of Beds: This policy differs from the Any Surplus policy in that patients are reallocated with equal probability to any of the available (surplus) beds in the regions with a sufficient surplus. The effect of this is that regions with a larger bed surplus at the respective department have a larger probability to receive patients. • Stochastic Program: This policy reallocates patients to regions according to the here-and-now decision coming from Program (P), which is a stochastic program. To approximate the optimal solution to the stochastic program, sample average approximation is used for the second stage. The scenarios are determined from Decision Rule 3 using 10 0 0 samples of the maximum occupancy obtained from the PALM of the system of infinite server queues for the daily infections forecast and a horizon of 3 days. The costs γ r,r of patient reallocation from region r to r are given in Table 5 . This cost matrix indicates that re-gions 1, 4 and regions 2, 3 lie close to each other, i.e., the regions are clustered in clusters of two. Next, we take g = x → x 2 . • Integer Program: This policy reallocates patients to regions according to the here-and-now decision coming from Program (P ), which is an integer program. The forecasts are determined from Decision Rule 3 using 10 0 0 samples of the maximum occupancy obtained from the PALM of the system of infinite server queues for the daily infections forecast and a horizon of 3 days. The costs of patient reallocation and g are the same as in the stochastic program.
Given the initial occupancy, arrival rates, LoS distribution and probabilities of transfers to other departments, the PALM of the system of infinite server queues corresponding to each region modelled as a single hospital is simulated according to the method described in Baas et al. [40] on a day-to-day basis for each day in the evaluation period. The results of the simulation study are reported in Tables 6 and 7 . The policies are evaluated on the average (over 250 replications) total amount of reallocated patients (Total), amount of patients reallocated to the external region, amount of patients reallocated across regions within the clusters (across regions 1 and 4 and across regions 2 and 3, In clusters), between clusters (Btw. clusters) and the average cost per reallocation, defined as s,r,r γ r,r f W,r,r (s ) / s,r,r f W,r,r (s ) for the ward and by analogy for the ICU. To evaluate the significance of differences in the averages, the 95% confidence interval based on Student's tdistribution is also shown in the tables. Table 6 includes results for k as = 2 , i.e., for regions that reserve 2 ward and ICU beds for autonomous same-day admissions. The four policies do not show a significant difference in the average total number of reallocated patients in the wards and ICUs (confidence intervals are overlapping), but do show a clear difference in number of patients reallocated to the external region, and most importantly, both optimisation models considerably reduce the number of patients reallocated out of a cluster (by roughly 50%) compared to the other methods and thereby also the average reallocation cost. Most of the KPIs do not differ significantly between the stochastic program (Stoch. Prog.) and the integer program (Int. Prog.). It is seen that under the stochastic program, more patients are reallocated across regions in the clusters, and on average, more patients are reallocated in total. Further, there is a signif-  1  10  50  50  10  100  Region 2  50  10  10  50  100  Region 3  50  10  10  50  100  Region 4  10  50  50  10  100  External region  100  100  100  100  100   Table 6 Average number of patient reallocations across regions (with 95% confidence interval) for k as = 2 .
icantly lower average reallocation cost. This is due to the stochastic program being less conservative than the integer program. Table 7 presents the results of bed reservation via the thresholds k as for the Number of Beds policy (that outperforms the Any Surplus policy) and the integer program. With increasing k as the number of patients reallocated to the external region increases, as is to be expected. Our optimisation model slightly outperforms the Number of Beds policy. The Number of Beds policy may be a good heuristic if hospitals and regions may be convinced to not use safety beds, that are not required in our collaborative optimisation policy.
In conclusion, the integer program is a good approximation of the stochastic program, that avoids reallocations to the external region and is able to reallocate patients to closer regions, while keeping the total number of reallocations at roughly the same level.
Sensitivity analyses, evaluating the results presented in this and the previous subsections under different horizons and safety levels are presented in Appendix C .

Discussion and conclusion
This paper has introduced mathematical models and decision rules for dynamic fair balancing of COVID-19 patients over hospitals in a region and across regions. Patient flow is captured in the Poisson Arrival Location Model (PALM) and the corresponding network of infinite server queues for the ward and Intensive Care Unit (ICU) of a single hospital. The model includes transfers between ward and ICU and allows determining safety levels for ward and ICU bed occupancy and corresponding forecasts of bed surplus or bed shortage in the ward and ICU of each hospital or region. The dynamic fair balancing approach within a region is based on a dynamic predictive load balancing model incorporating a forecast of the occupancy based on publicly available regional infection data and Length of Stay (LoS) and transfer probabilities obtained from the Hospital Information System (HIS). This model extends load balancing models in literature to include real-time estimations of the arrival process, service and routing processes and their impact on forecast occupancy. The dynamic fair balancing model across regions is a stochastic program that may be accurately approximated by a mixed integer program taking into account forecasts of the future bed surpluses or shortages. It hence takes into account both the current occupancy and the forecast maximum occupancy over the next couple of days.
Our mathematical model is augmented by accurate statistical methods to predict patient arrivals, estimate LoS and transfer probabilities. For LoS and transfer probabilities, we have used the Kaplan-Meier estimators for censored data as developed in Baas et al. [40] . For patient arrivals, we have both improved prediction of patient arrivals based on the HIS and Richards' curves that was developed and shown to be very accurate in Baas et al. [40] and developed prediction of patient arrivals based on regional infection data. We have found that the latter provides better results as it captures changing trends in hospitals' arrival rates a few days earlier than the HIS data. In addition, for our dynamic load balancing model, we have developed an estimator of the load balancing dynamic allocation fractions of patients to hospitals in a region. Our forecasting method for bed occupancy is based on simulation of the PALM as developed in Baas et al. [40] using the estimated LoS and transfer probabilities and predicted arrivals based on regional infection data.
Our dynamic fair balancing models and statistical methods yield implementable decision rules for patient allocation to hospitals in a region or reallocation across regions based on safety levels and forecast bed surplus or bed shortage for each hospital or region. We have tested accuracy of our forecast using HIS data from September 4, 2020 until January 31, 2021 of hospitals in the ROAZ region Netwerk Acute Zorg West , containing the hospitals Groene Hart Ziekenhuis (GHZ), HagaZiekenhuis (Haga) and Leiden University Medical Center (LUMC). Our forecast of bed occupancy and of maximum bed occupancy over the next couple of days are shown to be very accurate. Using these forecasts, we have investigated the benefits of three levels of regional collaboration: individual hospitals (or no collaboration among hospitals), dynamic load balancing and merging all hospitals into a (virtual) regional hospital. The regional hospital exploits the statistical multiplexing gain and clearly makes optimal use of available beds, but may include patient transfers from the ward of one hospital to the ICU of another and requires hospitals to give complete control over patient admission to a regional dispatcher. Load balancing allows hospitals to govern their own policy and has clear and substantial benefits with respect to levelling the load over hospitals in the region.
The intra-regional load balancing decision rule may be developed into a decision support tool and incorporated in the ROAZ dashboard for allocating patients to hospitals. First steps in this direction have been set in collaboration with ROAZ region Netwerk Acute Zorg West . We have explored optimal reallocation of patients across regions based on current and forecast load in the regions and found that our decision rule that takes into account reallocation costs across regions and the current and forecast load in the regions results in fewer reallocations to regions far away. This inter-regional reallocation rule requires the same information as shared with the Landelijk Coördinatie-centrum Patiëntenspreiding (LCPS) and may be developed into a decision support tool for patient reallocation.
In addition to developing our results into decision support tools, several points for further research or improvement may be addressed. In our simulation study we considered a fixed decision epoch at 10 AM each day. As a consequence, a patient arriving inbetween two decision epochs is admitted to an over-bed until the next decision epoch. Immediate reallocation of this patient may be included in our simulation approach. However, this requires a real-time update of new admissions, discharges and reallocations among hospitals for all hospitals or all regions. As this results in increased dependence among decision epochs, a Markov decision process approach might also be investigated. In our hierarchical model, we split the decisions for inter-regional reallocations and load balancing within the region. As long as we consider the region as a single hospital, this does not influence the number of interregional reallocations since the bed surplus/shortage of a region is independent of (and determined prior to) the load-balancing allocation of patients to hospitals. Integrating the intra-regional and inter-regional decision levels is an open question for further research.
Given the quality of our forecasts, the significant reduction in reallocations to distant regions and the significant improvement of balanced load among hospitals within a region, we are confident that our decision rules provide an important step towards practical implementation of a decision support tool for real-time reallocation of COVID-19 patients. Moreover, our methodology may also be beneficial for patient reallocation during future pandemics or national outbreaks, with fine-tuning of the statistical methods. Our mathematical models are generic and not specific to COVID-19; all they require is data from which patient arrival rates can be predicted as well as in-hospital data on patient transfers and discharges. Lastly, we envision our dynamic load balancing procedure to be applicable well beyond the scope of patient reallocation. We aim to unlock this potential in future research, by incorporating our dynamic load balancing procedure in a generic queueing framework.

Data availability
Data will be made available on request. and scaling factor θ = exp (χ h,n ) for the arrival rate. The trajectories are independent over i, n and can be sampled according to the PALM simulation method given in Baas et al. [40] . The parameter M in denotes a number of inner simulations, which are performed for each iterate in (14) . From the above, it can be seen that ˜ π hW,n ( exp (χ h,n )) , ˜ π hI,n ( exp (χ h,n )) are bounded random variables with expectation equal to π hW ( exp (χ h,n )) , π hI ( exp (χ h,n )) . (14) is a Robbins-Monro sequence [57] , for which we verify (almost sure) convergence below. Let η(x ) = E [ min ˜ π hW, 1 ( exp (x )) , ˜ π hI, 1 ( exp (x )) ] .

The sequence defined by
If η(x ) = 0 for some x and π hW (0) , π hI (0) ≥ 0 , all assumptions in Blum [58] are satisfied, so that (χ h,n ) n converges (almost surely) to a constant limit χ h such that η(χ h ) = 0 . Next, consider the case η(x ) = 0 for all x ≥ 0 . It can be shown that η is a decreasing, differentiable function with lim x →∞ η(x ) = min (−α hW , −α hI ) . Hence if η(x ) = 0 for all x ≥ 0 , we have that η is negative everywhere. The minimum in (14) can be decomposed in η(χ h,n ) and a martingale difference h,n (χ h,n ) , from which it follows that χ h,n is the sum of an almost surely converging martingale (bounded in L 2 as n α(n ) 2 < ∞ ) and a deterministic series with negative increments. From this, it follows that χ h,n → −∞ almost surely. Now, set θ h,n = exp (χ h,n ) . By the above discussion it follows that either θ h,n → 0 in the case of bed shortage or min E ˜ π W (θ h,n ) , E ˜ π I (θ h,n ) ≥ E [ min ˜ π W (θ h,n ) , ˜ π I (θ h,n ] = η(χ h,n ) → 0 (15) in the case of bed surplus at hospital h . Hence, in the latter case, the limit of the sequence θ h,n satisfies the condition given in (5) . We estimate θ h,α hW ,α hI as the almost sure limit of θ h,n in the manner described below.
For each day s , we sample the sequence defined in (14) with χ h, 0 the logarithm of θ h,α hW ,α hI obtained for day s − 1 . For day 0, we set χ h, 0 equal to the logarithm of the historical fraction of COVID-19 patients allocated to the hospital. We chose step-size a (n ) = n −0 . 51 (satisfying the Robbins-Monro conditions [57] ) and M in = 5 , which was seen to result in fast convergence of the iterates to a stationary point. Convergence of the sampler is assessed by checking for every batch of 300 iterations whether the batchmean of exp (χ h,n ) is smaller than 10 −5 or whether the batch-mean of the residuals min ˜ π hW,n ( exp (χ h,n )) , ˜ π hI,n ( exp (χ h,n )) is smaller than 0.01. After diagnosing convergence on a batch of iterations, θ h,α hW ,α hI is estimated as the mean of samples θ h,n for that batch.
Appendix B. Sample path of hospitals' occupancy in single region Fig. 5 presents a sample path of the ward and ICU occupancy for all hospitals in a region for the coordination levels Load Balancing (left) and Individual Hospitals (right) under the simulation study described in Section 4.1 . Jumps occur on a daily basis, starting at November 8, 2020, 10:00. The evolution of the allocation probabilities is included in the bottom row of the figure.
Under Load Balancing, an inverse proportional relation may be observed between occupancy at a certain hospital and its allocation probability. Note that as θ h,α hW ,α hI (s, t ) aims to control both the occupancy in the ward and ICU, θ h,α hW ,α hI (s, t ) is low (often zero) when one of these departments reaches a bed shortage. It is seen in this sample path that the ICU at LUMC is often full, hence the corresponding allocation probability is also often set to zero. Note that it seems harder to control the occupancy at the ICU using the allocation probability, as most of the patients at the ICU originate from the ward. During periods when every hospital has a crowded department, the allocation probabilities are seen to have a fluctuating behaviour, often sending all patients to one hospital one day, and the next day to another. This can sometimes lead to a large increase in over-beds, for instance in the ward around January 5, for Haga. Around December 21, 2020, there were bed shortages at the ICU of hospitals GHZ and LUMC and in the ward at Haga, as a result the historical allocation probabilities (0.30,0.43,0.27 for hospitals GHZ, Haga and LUMC resp.) were used around this period. Note that as departments at the hospitals are already over-occupied during this period, setting these probabilities larger than zero will lead to an even larger bed shortage. This is a consequence of the setup in Section 4.1 , where patients have to be admitted to a hospital in the region. In reality, as is also considered in Section 4.2 , patients will be allocated out of the region. From the sample path it can be seen that bed shortages are often resolved quickly, often in a matter of a few days. Under the Individual Hospitals scenario, the allocation probabilities stay constant over time. The result is that there are long periods of bed shortages, as can be seen from Fig. 5 . The ICU is often overcrowded at GHZ, while the ICUs at the other hospitals become almost empty. On a sample path level, the Load Balancing rule indeed seems to show a more balanced behaviour of the occupancy over time.
increases. This could be explained by the fact that the load balancing rule sends patients to Haga most of the time in scenario 3, and if bed shortages occur there almost all patients will go to GHZ as the safety level of LUMC is a lot higher. A significantly lower (higher) load coefficient is seen at LUMC (Haga) for scenario 3 than for scenario 1, while the load coefficient for GHZ decreases slightly. When comparing the KPIs for the region as a whole for the three scenarios, scenario 1 is the preferred choice as it has the lowest amount of over-bed (days) on average.
Next, in Table 9 , results are shown for the simulation study of Section 4.1 for horizons 3 and 5 days. No significant differences ( 95% ) are seen in the KPIs, the load balancing policy is seen to perform slightly worse on average for a horizon of 5 days when looking at the regional numbers. The samples of maximum occupancy over 5 days are larger than over 3 days, hence the load coefficients are significantly smaller for a horizon of 5 days.
Finally, in Table 10 , the results for the simulation study of Section 4.2 are shown for horizons 3 and 5. The policies show a similar performance, no significant differences were found. Slight increases are seen on average when looking at the number of patients reallocated in the clusters and the total number of reallocated patients.